Data Warehousing is Still Alive

Over the past several years many industry experts have declared the data warehouse to be dead. As early as 2011 Michael Hiskey made the case in blogs and when speaking at conferences.

Also in 2011, Philip Howard at Bloor Group proclaimed, “The EDW is dead. Period. Like a dodo. Like Monty Python’s parrot.” More recently, Stephen Smith’s July post on The Demise of the Data Warehouse generated a lot of interest and many comments.

“The reports of my death have been greatly exaggerated.”
Mark Twain

Yet, a 2015 survey conducted by Dimensional Research and cited by Snowflake Computing shows that:

  • 99% of respondents see their data warehouse as important for business operations
  • 70% are increasing their investment in data warehousing

I believe that the data warehouse is alive despite the many declarations of death. Nearly every enterprise has a data warehouse and many have more than one. So why do we see so many premature obituaries? They derive from the many struggles of data warehousing. Note that I say the data warehouse is alive. I don’t say “alive and well.” Data warehousing faces many challenges, but let’s not confuse challenged with terminal.

We’re sometimes too quick in technical fields to see new technologies as replacements for older ones. COBOL, for example, was declared dead in the mid-1980s. Yet COBOL still has a role today in healthcare for 60 million patients daily, in 95% of ATM transactions, and in more than 100 million lines of code at the IRS and Social Security Administration alone. And it’s not only COBOL. Just last week I read an article declaring death for the popular programming language Ruby. In 2013 SQL was declared dead, yet thousands of SQL job postings can be found on the web today.

So, please read these proclamations of data warehouse demise with a healthy dose of skepticism. The data warehouse is alive but it faces many challenges. It doesn’t scale well, it has performance bottlenecks, it can be difficult to change, and it doesn’t work well for big data. It certainly hasn’t lived up to the promises of the past. Smith is right that the single version of the truth still eludes us.

Data warehouses still meet the information needs of people and continue to provide value. Many people use them, depend on them, and don’t want them to be replaced with a data lake. Data lakes serve analytics and big data needs well. They offer a rich source of data for data scientists and self-service data consumers. But not all data and information workers want to become self-service consumers. Many – perhaps the majority – continue to need well-integrated, systematically cleansed, easy-to-access relational data that includes a large body of time-variant history. These people are best served with a data warehouse.

The data warehouse needs to be modernized. Migrating to the cloud resolves many data warehousing challenges. Scalability and elasticity are well-known cloud benefits. Cloud data warehousing also brings the benefits of managed infrastructure, cost savings, rapid deployment, and fast processing. Data warehousing expectations also need to be reset. The data warehouse will never be the one-size-fits-all data repository, and the single version of the truth will continue to elude us. With realistic expectations, however, data warehouses become an integral part of a comprehensive data management and information services architecture.

It is not my intent to discount or dismiss the views of others. Michael Hiskey is a smart man and an innovative thinker. Stephen Smith is a smart man and every conversation with him is thought-provoking. I’ve not met Philip Howard but I assume from his research and writing that he has similar qualities.

Smith’s vision of DL + MDM is a good beginning. I think it needs to be extended to become DL + MDM + DW. Architectural views may vary: warehouse inside the lake vs. warehouse alongside the lake, for example. Regardless of architecture, all three approaches – DL, MDM, and DW – contribute value and capabilities to an adaptable and comprehensive data management strategy.

Don’t discount or decommission your existing data warehouse. Don’t relegate it to the legacy junk pile. You need it, but you need to modernize it. And while planning for modernization, think about next-generation MDM too.

About Dave Wells

Dave Wells leads the Data Management Practice at Eckerson Group, a business intelligence and analytics research and consulting organization. Dave works at the intersection of information management and business management, where real value is derived from data assets. He is an industry analyst, consultant, and educator dedicated to building meaningful and enduring connections throughout the path from data to business value. Knowledge sharing and skills development are Dave’s passions, carried out through consulting, speaking, teaching, and writing. He is a continuous learner – fascinated with understanding how we think – and a student and practitioner of systems thinking, critical thinking, design thinking, divergent thinking, and innovation. He can be reached at dwells@eckerson.com.

  • Martijn ten Napel

    The data warehouse architecture needs to be modernised. I keep repeating this: people confuse technology, technical implementations and the need for fully articulated data models for certain use cases. The data warehouse represents the fully articulated data model. Self service without a fully articulated data model will result in confusion. Most users of self service products are business analysts with limited technical skills and gaps in their understanding of data models.

    Barry Devlin has pointed to a modern data (warehouse) architecture encompassing all the new data types and sources that were not as prolific when the CIF architecture was created. In his book “Business unIntelligence” he proposes the REAL architecture (a logical architecture!) starting from the question ‘how do people use information’ instead of all the technology-driven data lake versus data warehouse points of view.

    I’m amazed that more people have not picked up on this, but then again most architects are in fact IT people instead of information people. Most architects have a very limited understanding of how information is used.
