A large athletic shoe manufacturer has the slogan “Just do it!” While this may be appropriate for running and jumping, it’s the death knell for data warehouse projects. Data warehouse
projects that start without a project plan are prone to being short on required resources, late and over budget; their deliverables are of poor quality and, most importantly, they do not
give the users what they were expecting.
While project management is critical for operational projects, it is especially critical for a data warehouse project, as there is little expertise in this rapidly growing discipline. The data
warehouse project manager must embrace new tasks and deliverables, develop a different working relationship with the users and work in an environment that is far less defined than traditional
Data warehouse project management deals with Project Plans, Scope Agreements, Resources, Schedules, Change Control, Managing Risk, Communication and Project Management Tools and Methodology.
A good project plan lists the critical tasks that must be performed and when each task should be started and completed. It identifies who is to perform the tasks, describes deliverables to be created
and identifies milestones for measuring progress.
There is some overlap in project management between a data warehouse project and an operational project but many tasks and deliverables are specific and unique to data warehouse. MapXpert for Data
Warehouse™ identifies some of these unique tasks:
- Define data warehouse objectives
- Define Query Library/Location Matrix
- Define Query Libraries
- Integrate meta data
- Develop a refresh strategy
- Define data warehouse product selection criteria
- Prepare migration specifications
- Develop query usage monitoring
The scope agreement is the primary document of understanding between IT and the user. It specifies what functions and data will be delivered and states explicitly what will not be delivered. It also specifies dates, the responsibilities of both IT and the user, the end-user tools to be provided, end-user training and who will pay for the project. The scope agreement, or lack thereof, can be the primary reason for user dissatisfaction. Without a scope agreement, the user usually expects more than IT is planning to deliver, and there is no basis for determining whether the project was a success or a failure.
A wise project manager counseled those new in the field to read the scope agreement every Friday afternoon and to make sure that the agreement will be met. It may make sense to deliver some
capability over and above what is specified in the document but if you don’t deliver on the scope commitments, you have failed as a project manager.
The scope agreement should be reviewed periodically with the user to let them know that you are working toward the specific deliverables itemized in the document.
We often hear the lament that the users are unresponsive to requests for their time and sign-offs. Since their responsibilities are included in the scope agreement, it should be easier to get their
attention when you point out their commitments.
The most challenging job for the data warehouse project manager is recruiting skilled internal personnel. The competent workers are almost always very heavily involved with other projects. Pulling
them into the data warehouse world takes a very creative and well-connected project manager who can sell both management and the targeted worker.
It may be possible to recruit an employee skilled in certain aspects of the data warehouse. A recruiting manager who is new to data warehouse may be easily snowed by the right buzzwords and
impressed by knowledge of some specific area of data warehouse. Experience in writing queries does not equate to skills in data warehouse database design. A data warehouse-experienced consultant
could aid in evaluating the candidate.
While it may be expedient to engage consultants for advice and direction and to engage contractors for specific tasks, the organization should either have the skills in house or plan to develop
those skills. You may choose to bring in a hired gun, but do you want them around after they have cleaned up the town?
A key resource is budget. Without an adequate budget, the project manager has no hope of hiring good people, choosing the best products and tools, acquiring the right hardware and hiring
consultants and contractors. What should a data warehouse cost? The cost will be a function of the following:
- The size of the database. (Keep in mind capacity planners must include space for indexes, summaries and working space.)
- The complexity and cleanliness of the data
- The number of source databases and their characteristics
- The number of users
- The choice of tools
- The network requirements
- The delta between the required and the available skills
A final note on budgets: it’s impossible to develop an accurate budget without a good plan as its base.
Data warehouse schedules are usually set before a project plan has been developed. The project plan, with its durations, assignments and predecessor tasks, is the only means of determining how
long a project will take. Without such a plan, assigning a delivery date is wishful thinking at best.
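As a rough illustration of how durations and predecessor tasks determine a delivery date, the sketch below computes the earliest possible finish for each task and the overall length of a small plan. The task names loosely echo tasks mentioned in this article, but every duration and dependency is invented for the example, and the calculation assumes unlimited staff and no holidays, which a real plan would not.

```python
# Hypothetical plan: task -> (duration in working days, list of predecessor tasks).
# Task names echo the article; every duration and dependency is invented.
PLAN = {
    "define_objectives": (5, []),
    "select_products": (10, ["define_objectives"]),
    "design_database": (15, ["define_objectives"]),
    "prepare_migration_specs": (10, ["design_database"]),
    "develop_refresh_strategy": (5, ["design_database"]),
    "load_and_validate": (
        20,
        ["select_products", "prepare_migration_specs", "develop_refresh_strategy"],
    ),
    "train_users": (5, ["load_and_validate"]),
}


def earliest_finish(plan):
    """Return {task: earliest finish day}, ignoring staffing limits and holidays."""
    finish = {}

    def visit(task, in_progress):
        if task in finish:
            return finish[task]
        if task in in_progress:
            raise ValueError(f"circular dependency involving {task!r}")
        duration, predecessors = plan[task]
        # A task can start only when its latest predecessor has finished.
        start = max((visit(p, in_progress | {task}) for p in predecessors), default=0)
        finish[task] = start + duration
        return finish[task]

    for task in plan:
        visit(task, frozenset())
    return finish


def project_duration(plan):
    """Length of the longest dependency chain, in working days."""
    return max(earliest_finish(plan).values())


if __name__ == "__main__":
    print(project_duration(PLAN))
```

With these invented numbers the longest chain runs 55 working days. Any delivery date earlier than that is not supported by the plan, whatever date was promised beforehand.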
An unrealistic schedule pushes the team into a mode of taking shortcuts that ultimately impact the quality of the deliverables. A good project plan is a powerful tool for resisting unreasonable
What about those who say, “We need a stake in the ground. Otherwise the project will go on indefinitely”? This approach does not give the project manager any credit for developing a scope
agreement and producing a project plan. If imposing a delivery date is the only way to get the project manager to deliver, you have the wrong project manager.
The user would like the data warehouse delivered very fast. In fact, they have been told by countless vendors that a data warehouse can be delivered in an unrealistically short time. What the vendors
fail to include are:
- Understanding and documenting the data – they assume the data is well understood and documented
- Cleaning the data – they assume the data is clean
- Integrating data from multiple sources – they assume one source
- Dealing with performance problems – they assume performance can be dealt with at a later time
- Training internal people so they can enhance and maintain the data warehouse
What gets delivered in short periods of time are small warehouses that are not industrial strength, not robust enough to be enhanced and not of much use to anyone.
The data warehouse lends itself nicely to phasing. This means that increments can be developed and delivered in pieces without the traditional difficulty associated with phasing an operational
system. Each phase needs its own schedule but there should be an overall schedule that incorporates each of the phases.
The nature of the data warehouse is such that new requests will constantly be added. These requests may come in the form of wanting new data from current source files, new canned reports, a new
delivery vehicle (web), the integration of new source files or access by more users.
Most project managers include a small amount of fat in their estimates to be able to accommodate unforeseen contingencies and small scope changes. Sometimes, the small changes begin to creep up to
the point where the existing schedule can no longer handle the additional requirements. This is scope creep. When the requests are for large changes, new and complex source data or some new
technology implementation, it is referred to as scope gallop.
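One way to make the distinction concrete is to track each change request against the contingency built into the estimate. The sketch below is only an illustration: the `ChangeRequest` type, the `triage` function, the effort figures and both thresholds are assumptions made for the example, not rules from any methodology.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    description: str
    effort_days: float


def triage(requests, contingency_days, gallop_threshold_days):
    """Label each request 'absorb', 'scope creep' or 'scope gallop'.

    Both thresholds are assumptions made for this sketch, not rules from the article.
    """
    remaining = contingency_days
    labels = []
    for request in requests:
        if request.effort_days >= gallop_threshold_days:
            # Too large for contingency regardless of what is left.
            labels.append((request.description, "scope gallop"))
        elif request.effort_days <= remaining:
            remaining -= request.effort_days
            labels.append((request.description, "absorb"))
        else:
            # Small on its own, but the contingency is already used up.
            labels.append((request.description, "scope creep"))
    return labels


# Hypothetical requests with invented effort estimates.
REQUESTS = [
    ChangeRequest("new canned report", 3),
    ChangeRequest("new data from a current source file", 4),
    ChangeRequest("web delivery vehicle", 20),
    ChangeRequest("access for more users", 5),
]

if __name__ == "__main__":
    for description, label in triage(REQUESTS, contingency_days=10, gallop_threshold_days=15):
        print(f"{label:12} {description}")
```

Anything labeled as creep or gallop is a signal to go back to the scope agreement and the schedule through change control rather than quietly absorbing the work.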
If the schedule derived from the project plan was realistic, a major change cannot be accommodated within the existing schedule; some other project variable has to give. The project variables are:
- Function – what capability will be delivered
- Resources – this includes budget, skilled personnel and management commitment
- Quality – An unrealistic schedule may push the team to take shortcuts such as minimal testing, incomplete documentation and inadequate training.
- Team health – Some managers believe they can drive their team 12 hours each day and seven days each week. This can work for only a short time. After that, the team may be present but
their productivity and work quality suffer greatly.
Every project will have a degree of risk. The goal of the project manager is to recognize and identify the impending risks and to take steps to mitigate those risks. Since the data warehouse may be
new, the risks may not be as apparent as in operational systems.
The loss of a sponsor is an ever-present risk. It can be mitigated by having at least one backup sponsor identified who is interested in the project and would be willing to provide the staffing,
budget and management drive to keep the project on track. The risk can also be lessened by keeping user and IT management informed of the progress of the project along with reminding them of the
The user may decline to use the system. This problem can be overcome by having the user involved from the beginning and involved with every step of the implementation process including source data
selection, data validation, query tool selection and user training.
The system may have poor performance. Good database design with an understanding of how the query tools access the database can help hold performance in line. Active monitoring can provide the
clues to what is going wrong. Trained DBAs must be in place to first monitor and then take corrective action. Well-tested canned queries made available to the users should minimize the chances of
them writing “The Query that Ate Cleveland.” Training should include a module on performance and how to avoid problem queries.
Probably the most neglected aspect of a data warehouse project is keeping everyone informed. This includes the project team as well as the sponsors and end user representatives. People’s memories
are short. Periodic formal and informal communications should be an integral part of every data warehouse project plan, especially large data warehouse projects.
The project plan should include monthly presentations to sponsors and end user representatives. Each presentation should include:
- A review of the scope and deliverables of the project
- A “we are here” on the project time-line
- A discussion of any issues that have been difficult to resolve; frequently, one of the attendees can help cut through these tough obstacles
- A frank discussion of any activities that are behind schedule or have problems
- A review of the coming month’s activities and priorities
- Any contingency plans to make up time and address problems, including additional resources or schedule relief, if needed
- An open question and answer period
- A summary and conclusions
Unfortunately, not everyone with an interest in the project can attend the formal presentations. A semi-monthly newsletter (e-mail works well) containing highlights can be circulated to
Data warehouse projects are generally very expensive, involve a lot of people over a long term and include many activities that are not familiar to sponsors, end users or developers. This being the
case, data warehouse projects generally need all of the goodwill they can get. A well-executed communications plan can generate or preserve this goodwill and help keep harmful rumors and bad press
to a minimum.
We have all seen disingenuous reports that gloss over problems and paint an incomplete, very glossy and overoptimistic version of the truth. It’s imperative that communications be honest.
Dishonest notices will make the staff cynical; they will start to question the truthfulness of any project communication.
Project Management Tools and Methodology
While project management methodologies and software tools are no substitute for good project management, there are a few that can set an organization on the right track and some have project
planning capabilities specific to data warehouse. The methodologies developed for operational systems are not appropriate for data warehouse. Some of the large consulting organizations have data
warehouse methodologies and project planning but they are typically not sold without their consultants. The following software products and methodologies are specific to data warehouse:
- Hadden-Kelly Data Warehouse Method, Manhattan Beach, California
- Xpert (MapXpert for Data Warehouse), Newport Beach, California, www.planxpert.com
- Sterling Software (VISION:Expert), Woodland Hills, California
- Ziga (CEM), Basking Ridge, New Jersey
Good project management can often make the difference between a successful and a disastrous data warehouse implementation. It’s naïve to believe that since data warehouses are different from
operational systems, we don’t have to concern ourselves with project plans, schedules, resources, risk, scope agreements and change control. The projects that succeed are those that will have good