Published in TDAN.com January 2001
Independent data marts have spread like a disease through many of today’s best and most advanced corporations. The devastating nature of this disease is that it is not easily detected in its
initial stages; however, if it is not treated, the patient’s condition will steadily deteriorate.
This article is the second and concluding portion of a two-part series on migrating from independent data marts. In part one (October 2000, TDAN.com) we examined the characteristics of independent
data marts, the flaws in their architecture, and the reasons why they exist. This installment focuses on the approaches for migration, initial planning, and how to identify a migration path, and
presents a case study illustrating how a corporation can migrate from independent data marts to an architected solution.
Approaches to Migration
There are two general approaches for migration: “Big Bang” and “Iterative”. Table 1 summarizes the advantages and disadvantages of each approach.
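Approach | Advantages | Disadvantages | Bottom Line |
“Big Bang” | Fastest path for migration; immediate economies of scale | Labor intensive; requires tremendous coordination; the more complex of the two to implement, with the highest exposure | Best used when the independent data mart problem is not very pervasive. |
Iterative | Manages and reduces risk; lessons learned in each phase are leveraged in later phases | Takes longer to fully complete the migration | This approach is best used when the independent data mart problem is large and complex. |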
Big Bang Approach
As the name implies, all of the independent data marts are reengineered simultaneously into a structured DSS architecture. There are a couple of advantages to this approach. First, it can
provide the fastest path for migration. Companies often need to change their DSS architecture as quickly as possible, either because they need to implement additional DSS projects that promise
a high ROI or because funds are currently available for the effort that might not be available at a later date. Second, this approach allows for immediate economies of scale rather than
slowly attaining them as in the Iterative method. The disadvantages of this approach are that it is labor intensive and requires tremendous coordination. In addition, the “Big Bang” approach is
the more complex of the two to implement and thus carries the highest exposure to risk.
This approach is best suited to situations in which the independent data mart problem is relatively small and not highly complex. When the problem is large, the complexity of the migration grows at a
tremendous rate.
Iterative Approach
This approach reengineers the independent data marts in manageable phases, one or two data marts at a time. The advantages of this approach are several. First, it allows a company to
manage and reduce the risk involved in a migration effort, because the migration is accomplished in a phased manner, thereby increasing the probability of the project’s
success. Second, as each project phase is executed, lessons are learned and leveraged for subsequent phases. This is very valuable, as typically once the first phase is completed the follow-up phases
run much more smoothly.
The major disadvantage of this approach is that it takes longer to fully complete the migration. This approach is best used when the independent data mart problem is large and too complex to
tackle in a “Big Bang” manner.
Initial Planning
Many companies fail in their migration efforts well before they start. The chief reason for this is a lack of initial planning and sponsorship. Attaining executive sponsorship is one of the most
important tasks at the onset of the project. This is critical because typically each of the independent data marts has been constructed by an autonomous team in a different corporate department. Therefore,
having a project champion with cross-departmental authority is essential for dealing with the political challenges that are commonplace in these migration efforts.
During the initial planning phases it is important to plan on implementing a meta data repository that can support future DSS development efforts and that will provide a semantic layer between the
business users and the DSS system. The data mart migration provides an outstanding opportunity to implement the meta data repository. Before the data mart migration begins it is best to standardize
the data naming nomenclature for the DSS system. A standard naming nomenclature will aid in the DSS system’s maintenance and provide cleaner and more understandable meta
data.
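As a rough illustration of what enforcing a naming standard might look like, the sketch below checks column names against a made-up convention: lowercase words joined by underscores and ending in an approved class word. The pattern, the class-word list, and the column names are all assumptions for this example, not a standard proposed by this article.

```python
import re

# Hypothetical standard: lowercase words joined by underscores, ending in an
# approved class word (e.g. cust_id, order_amt). Both the pattern and the
# class-word list are assumptions made for this illustration.
CLASS_WORDS = {"id", "amt", "dt", "cd", "nm", "qty"}
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$")


def naming_violations(column_names):
    """Return the column names that do not follow the illustrative standard."""
    violations = []
    for name in column_names:
        well_formed = NAME_PATTERN.match(name) is not None
        class_word = name.rsplit("_", 1)[-1]
        if not well_formed or class_word not in CLASS_WORDS:
            violations.append(name)
    return violations


# Made-up column names from two independent data marts.
marketing_columns = ["cust_id", "CustName", "order_amt"]
finance_columns = ["customer_number", "order_dt", "ord_qty"]

print(naming_violations(marketing_columns + finance_columns))
```

A check of this kind could be run against each data mart's catalog before migration so that nonconforming names are renamed once, on the way into the new architecture.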
A great deal of research needs to be conducted on the independent data marts before a migration is possible (Table 2 summarizes these tasks). The most important research activity is to understand
the business needs that each independent data mart is meeting. Typically, multiple independent data marts will exist to meet the same or similar business needs. These situations are common and
suggest a path for migration. The results of this research will also show which independent data marts will be the most difficult to migrate.
An independent data mart migration is an excellent time to standardize the hardware and software for the DSS project (Table 3: Hardware/Software Classification). For each differing software
or hardware platform a company needs trained personnel to support it. Therefore, by limiting redundant software and hardware the corporation reduces the support strain on its IT staff. In
addition, standardizing allows economies of scale to be achieved in software and hardware purchasing.
Independent Data Mart Research |
Amount of data (raw and total) in each data mart |
Data refresh/update criteria |
Archive criteria |
Understand the business users’ requirements of the data mart |
Identify transformation rules |
Hardware/Software Classification |
Hardware (Unix, Mainframe, AS400) |
Hardware Architecture (SMP, MPP, NUMA) |
Desktop Computers |
Notebook Computers |
Database (Oracle, SQL Server, DB2) |
ETL Tool (Ardent, Prism) |
Meta Data Integration Tool (Microsoft Repository, Platinum Repository, Viasoft Rochade) |
OLAP Access Tool (Business Objects, Cognos, Microstrategy) |
Data Quality Tool (Prism Quality Mgr., id Centric, Trillium, Vality) |
CASE Tool (Erwin, PowerDesigner, Cayenne) |
Golden Rule
The central covenant of any independent data mart migration effort is to “Never deliver less functionality to the business users than what they have today”. Generally, business users do
not react well to spending money on infrastructure because they don’t initially see its value. The key business users need to be educated that a bad system architecture leads to a
non-scalable and non-flexible system that will eventually need to be rewritten at a very high cost. Therefore, during migration the users must be assured that they will not receive less
functionality (information, ease of use, and response time) than they receive today.
Identifying a Migration Path
There are several activities that are necessary to conduct before a migration path will be evident.
Create Your Own Spaghetti Chart
First, diagram out the current DSS architecture. This is critical for identifying which legacy systems are feeding which independent data marts.
Identify Redundant Data
Often independent data marts will be sourced from the same legacy systems. By targeting independent data marts with the same source data, multiple independent data marts can often be removed with
minimal extra effort. Identifying redundant data often suggests a migration path.
Figure 2 illustrates existing independent data marts for a company. In the schematic both the Finance and Marketing data marts are being sourced from the same legacy systems. This suggests that it
might be wise to target both of these data marts for initial migration (assuming the Iterative approach is being used).
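Once the spaghetti chart has been recorded as data, finding marts with overlapping sources is a small exercise. The following is a minimal sketch under stated assumptions: the feed map and the legacy system names (`order_entry`, `general_ledger`, `plant_floor`) are invented for illustration and are not taken from the figure.

```python
from itertools import combinations

# Illustrative feed map: data mart -> legacy systems that load it.
# The system names are invented for this sketch.
feeds = {
    "Finance": {"order_entry", "general_ledger"},
    "Marketing": {"order_entry", "general_ledger"},
    "Quality Control": {"plant_floor"},
}


def shared_source_pairs(feeds):
    """List pairs of marts that share sources, most overlap first."""
    pairs = []
    for mart_a, mart_b in combinations(sorted(feeds), 2):
        common = feeds[mart_a] & feeds[mart_b]
        if common:
            pairs.append((mart_a, mart_b, sorted(common)))
    pairs.sort(key=lambda pair: len(pair[2]), reverse=True)
    return pairs


for mart_a, mart_b, common in shared_source_pairs(feeds):
    print(f"{mart_a} and {mart_b} share: {', '.join(common)}")
```

Pairs at the top of the list are candidates for migrating together, since one extraction from the shared legacy systems can serve both marts.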
Identify Paths of Least Resistance
Data
It is important to target those independent data marts whose data will most likely be used in future DSS efforts. Targeting these data marts first eases the task of keeping all new DSS
development activity in the new architected environment.
The next step is to identify those data marts whose transformation rules are known and documented. Understand that even the best documented transformation rules will have gaps. Moreover, even those
marts that have been built using ETL (Extraction/Transformation/Load) tools have meta data (documentation) gaps. For example, ETL tools often provide the ability to call user exits, which
are hand-coded programs. The processes performed by these user exits will not be captured in the ETL tool’s meta data stores. If documentation does not exist for a mart, then programmers will
need to manually analyze the code of each ETL program to extract the transformation rules. Manually analyzing code to extract transformation rules is a very time-consuming and expensive
activity.
Political
It will be critical to obtain support from the current independent data mart IT teams and business users. Identify those data mart teams most likely to work cooperatively with the centralized DSS
team. Recognize the strengths and weaknesses of those teams that can and will provide the most aid. If a particular data mart team or its business users are not willing to assist with the migration effort,
it is best to work around them by delaying the migration of their particular data mart. If this is not an option, then use your executive sponsorship to “motivate” this group
to provide its support.
Understand Your Team’s Strengths and Weaknesses
Keep in mind that any team will have its stronger and weaker areas of knowledge. As much as possible, keep your team’s areas of weakness off the critical path. Any mission-critical team
weaknesses need to be shored up with internal members from the other data mart teams or from outside vendors.
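The data and political criteria in this section can be combined into a simple ranking to help decide which marts to migrate first. The sketch below is only one possible way to do this: the `MartProfile` fields, the equal weighting of the three criteria, and the sample marts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MartProfile:
    name: str
    data_reused_in_future_dss: bool  # data likely to feed future DSS efforts
    rules_documented: bool           # transformation rules known and documented
    team_cooperative: bool           # IT team and business users will assist

    def readiness(self) -> int:
        # Equal weighting is an assumption; adjust to local priorities.
        return sum([self.data_reused_in_future_dss,
                    self.rules_documented,
                    self.team_cooperative])


def migration_order(marts):
    """Sort marts so that the easiest candidates come first."""
    return sorted(marts, key=lambda m: m.readiness(), reverse=True)


# Hypothetical assessment results.
marts = [
    MartProfile("Mart A", True, False, False),
    MartProfile("Mart B", True, True, True),
    MartProfile("Mart C", False, True, False),
]

for mart in migration_order(marts):
    print(mart.name, mart.readiness())
```

A score like this is a conversation starter rather than a decision rule; marts with tied scores keep their original order, and factors such as executive sponsorship still need to be weighed by hand.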
Case Study: Putting the Concepts into Motion
The following case study puts the concepts we’ve discussed into action. It illustrates the iterative approach to independent data mart migration, because most companies that
have independent data marts typically have a pervasive and complex situation.
Background
The XYZ company is a Fortune 500 consumer electronics firm. XYZ recently acquired a smaller company (Acme Electronics) that has a single Marketing data mart about which little is known. In
addition, XYZ is standardizing on a new order entry system in 5 years, and the existing batch windows for the legacy systems have reached their limit. XYZ’s management team is stable, well
organized, and fully supports the migration effort. Table 4 lists the DSS-specific details and Figure 4 shows the current DSS architecture.
DSS Background |
Currently have 50 – 60 business users accessing the DSS systems |
Numerous potential business users are requesting access |
Current users want many new enhancements to the existing DSS systems |
Each data mart utilizes different tools, software, and hardware |
Marketing and Finance data marts have cooperative IT and business users |
Quality Control data mart has less knowledgeable IT staff and uncooperative business users |
Little to nothing is known about the Acme Marketing data mart |
Phase One Migration
Viewing the data makes it evident that the Marketing and Finance data marts share two common data sources (the old and new order entry systems). In addition, the Marketing data mart has a strong
end-user community that will be highly supportive of the migration effort. Finally, the business users of both the Marketing and Finance data marts have agreed to freeze their additional
functionality requests for Phase One of the migration.
During this phase we avoided migrating the Quality Control and Acme Marketing data marts because of the lack of support for the Quality Control mart and all the unknowns of the
Acme Marketing mart. Figure 4 illustrates the Phase One DSS architecture.
Phase Two Migration
During this phase the operational logistical system’s data is brought into the data warehouse, and the Quality Control data mart is now sourced directly from the enterprise data
warehouse. In addition, the Marketing and Finance teams’ change requests that were frozen during the Phase One implementation are now being developed. Lastly, a new dependent
Accounting data mart is now being sourced from the data warehouse.
Phase Three Migration
In this phase we merge the functionality of the former Acme Electronics Marketing data mart into the existing dependent Marketing data mart. Also, additional data marts continue to
appear (for example, a CEO data mart).
It is important to understand that migrating off of an independent data mart architecture is a costly proposition that will only get more expensive and difficult as time goes on. Remember, as with
any disease, the earlier it is detected and treatment begins, the sooner the patient will become healthy. However, if treatment is delayed, the patient’s condition will worsen and eventually
become terminal.