Top Mistakes to Avoid When Building a Data Warehouse

Published in April 2001

You’ve read all of the articles, attended all of the conferences, and you’re charged up to start your own data warehousing project and attain that 401% ROI (return on investment) that
you’ve heard everyone is getting. You’ve gone to the CFO (chief financial officer) or the VP of Marketing and gained the funding for your enterprise data warehouse. Congratulations, now
it’s time for the REAL work to begin. Oh, and by the way, you had better be successful, because this project won’t be cheap and all of the discerning eyes will be upon it. You will have to
evaluate and purchase ETL (extraction, transformation, and load) software, OLAP (online analytical processing) software, middleware, and all forms of hardware, and hire expensive consultants. And if
that isn’t difficult enough, remember that an enterprise data warehouse involves integrating data from a corporate perspective. Therefore, you will have to interact with the most senior people in
each of your company’s divisions and guide them to consensus. And if you don’t deliver on that 401% ROI and appease all of those division heads, you’ll be organizing office
layouts in the Siberian branch. This article will address the essential questions that require good answers, BEFORE the decision support system (DSS) project can even begin.

1. What are the specific strategic business objectives (drivers) that the warehouse is supposed to achieve?

The overriding reason many decision support projects fail is not that the projects were technically infeasible. On the contrary, most of the technological challenges of data warehousing have proven
answers. The most common cause of failure is that the warehouse didn’t meet the business objectives of the organization. I have seen many companies do a fine job of technically building a
data warehouse; unfortunately, the system didn’t help solve any of their business needs. Warehouses that don’t satisfy the business users’ needs are not accessed and eventually

2. What are the specific, calculable measurements that will be used to evaluate the ROI of the decision support system in meeting your company’s business objectives?

Clear business objectives are measurable. Defining these measurements is critical, since once the data warehouse project is completed the management team will have to justify the expenditure. Moreover, it’s
important to understand that a data warehouse is NOT a project; it is a process. Data warehouses are organic in nature. They grow very fast and in directions you’ve never anticipated. Most
warehouses double in size and in the number of users in their first year of production. Once a cost justification can be quantified for the initial release, the process for gaining funding for the
follow-up releases is greatly simplified.
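
As a minimal sketch of what a calculable measurement might look like, the snippet below computes a simple ROI percentage as (benefit - cost) / cost. The function name, the cost and benefit categories, and every dollar figure are hypothetical and chosen only for illustration; a real cost justification would use your own measured benefits and probably a multi-year horizon.

```python
def roi_percent(total_benefit: float, total_cost: float) -> float:
    """Return ROI as a percentage: (benefit - cost) / cost * 100."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost * 100.0


# Hypothetical figures for an initial release (illustration only).
costs = {"software": 400_000, "hardware": 250_000, "consulting": 350_000}
benefits = {"reduced_report_labor": 600_000, "improved_campaign_margin": 900_000}

total_cost = sum(costs.values())
total_benefit = sum(benefits.values())

print(f"Total cost:    {total_cost:,}")
print(f"Total benefit: {total_benefit:,}")
print(f"ROI: {roi_percent(total_benefit, total_cost):.0f}%")
```

Agreeing on the benefit categories with the business sponsors before the project starts makes the same calculation repeatable for each follow-up release.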

3. Are the key users of the data warehouse identified and committed to the success of the project?

The users ALWAYS dictate the success or failure of the warehouse. The users need to be heavily involved throughout the data warehousing project. To take it a step further, the users need to have a
personal stake in the success of the project. It’s amazing how quickly problems vanish when everyone has a vested interest in the project. Also, make it a point to educate them on the
fundamentals and processes of data warehousing. Teach them its benefits along with its limitations. This will significantly aid in managing their expectations. A good rule of thumb is that if you have
gone more than two weeks without talking to your users, then it’s time to set up a meeting. Keep in mind that these people are often the ones picking up the tab on these projects.

4. Is the organization trying to build a multi-terabyte, “do-all”, “be-all” data warehouse on its first iteration?

Data warehousing projects stretch an organization in ways unlike that of operational systems projects. From a political perspective an enterprise data warehouse requires consent and commitment from
all of the key departments within a corporation. In addition, the learning curve of the decision support project team is seldom understood or planned for. There will be a new and dizzying array of
software tools (ETL, OLAP, portal, meta data, data cleansing, and data mining) that will require tool-specific training.

Adding massive amounts of data into the equation significantly increases the points of failure. Moreover, large volumes of data will push the envelope of the RDBMS (relational database management
system), middleware, and hardware, and could force developers into using parallel development techniques if an MPP (massively parallel processing) architecture is needed. Keep in mind that the answer to many
of these challenges comes in the form of a hefty price tag. As a result, adding the dimension of size is just too painful and costly for most enterprises to attempt during the first iteration.

Data warehouses are best built in an iterative fashion. Do not misunderstand: this is not to recommend that a company should not build a fully functional, multi-terabyte, 7+ subject area,
web-enabled, end-to-end, enterprise data warehouse with a complete meta data interface. It simply means that the highest probability for success comes from implementing a decision support system in
a phased approach. Using the first iteration as an opportunity to train the corporation sets the stage for bigger and better future implementations.

5. Does the decision support project have support from executive management?

Any large-scale project, whether it’s a data warehouse or that hot new CRM (customer relationship management) system, needs executive management on board.
Moreover, their involvement is imperative in breaking down the barriers and the “ivory towers” in all of our companies. Their position gives them the ability to rally the various
departments within a corporation behind the project. Any substantial project lacking executive management participation has a high probability of failure.

6. Does the organization have a clear understanding of the concepts and tools involved in data warehousing?

If you do not have a data warehouse built, then the answer to this question will most likely be no. As a result, training and education will be required. Keep in mind that training is required at
many levels. First, initial education is necessary to convey the core concepts: data warehouse, data mart, operational data store, star schema design, and meta data. Second, data acquisition
developers will probably need to be trained on a transformation tool (e.g. Informix DataStage). Third, data warehouse access developers will require significant training in an OLAP tool (e.g.
Business Objects and Cognos Powerplay). Fourth, data administration developers will need training on a tool that will integrate all of the company’s meta data into one repository (e.g.
Platinum Repository). Fifth, more than likely there will be a web component used to access the data warehouse and the meta data repository. Depending on your organization, additional training and
outside consulting could be needed for each of these areas. Keep in mind that these are only the data warehousing specific training issues. There still needs to be an understanding of the hardware,
middleware, desktop, RDBMS, and coding language (COBOL, C++, etc.) of the transformation tool.

7. Are there a highly experienced project manager and a data warehouse architect, both with experience building warehouses, who will actively participate throughout the project?

Data warehousing projects are fundamentally different from operational projects. Operational projects are necessary in order to operate the day-to-day business of the company. Decision support
projects are critical for making strategic decisions about your organization. In addition, data warehouses grow at an alarming rate during the first few years of production. An experienced data
warehouse project leader understands these facts and keeps the vision of the project in concert with the real-world reality of decision support. In addition, the data warehouse architect must
design a scalable, robust, and maintainable architecture that can accommodate the expanding and changing decision support requirements.

These fundamental challenges require highly experienced, senior-level individuals. These positions can be filled via in-house resources or by consultants. If consultants are used to fill these
roles, it is imperative that the consultants are highly skilled at knowledge transfer and that in-house employees have been assigned to shadow the consultants for both of these roles.

8. Has an experienced consultant been brought in to do a readiness assessment of the organization?

This step is very important, since an experienced hand can identify problem areas in the organization that can be dealt with early in the decision support system project’s life cycle. Now,
identifying that person is another issue. Be wary of consultants without real-world, hands-on experience. It’s one thing to be able to write or speak about data warehousing; it’s
entirely something else to have the experience needed to navigate through the political quagmires and the knowledge of what it takes to physically build a data warehouse.

If you answered “No” or “I’m not sure” to any of these questions, then you’ll need to discover the answers before you head down the data warehousing trail.
Without those answers you might need to pack up your snowshoes and thermal underwear, because your supervisor will be ordering you that one-way ticket on Siberian Airlines.



About David Marco

Mr. Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence, and is the world’s foremost authority on metadata.  Mr. Marco is the author of several widely acclaimed books including “Universal Meta Data Models” and “Building and Managing the Meta Data Repository: A Full Life-Cycle Guide”.  Mr. Marco has taught at the University of Chicago, DePaul University, and in 2004 he was selected to the prestigious Crain’s Chicago Business "Top 40 Under 40". He is the founder and President of EWSolutions, a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing, enterprise architecture and managed metadata environment technologies (