It’s not hard to understand the temptation of the data lake approach. A data lake is easy to set up and start using, the data is quickly accessible to many departments, and the shackles of the IT department are removed.
The idea is seductive, but an old adage says you can pick two, but never all three, of the following: Good, Cheap, Fast. As it turns out, the greatest strengths of the data lake approach, Cheap and Fast, are also the source of its greatest flaws. Freedom of access and use gives rise to questions of governance, privacy, and oversight. The untouched nature of the data and its lack of structure often lead to slow or unusable results. And the insights lost to duplicated work that is created and then stranded in silos may have the worst economic impact of all.
Enter the data warehouse. When designed correctly by the right engineers (the “Good and Fast” option), it is a stable, performant, long-term analysis-generating super machine. The modeled data is available for reuse whenever needed and tends to be secure, fast, high quality, and better suited to analysis and processing, especially over time. Data lakes have their place, but suitable use cases are few and far between. If you’re looking for a long-term solution that responds quickly when called upon and can grow with your business, we highly recommend the data warehouse approach. It may not seem like the cheaper option at first, but it doesn’t take long to find out why it is.