Independent Data Marts – Part 1

Published in TDAN.com October 2000

Articles in this series – Part 1, Part 2

There is a severe disease that has spread to epidemic proportions throughout our society. This disease is particularly dangerous because its effects are not readily identifiable at the time of infection.
However, if this condition goes untreated, it can be debilitating and even terminal. This disease is not hepatitis, but rather “independent” data marts. While this imagery may seem a bit
dramatic, it unfortunately reflects the reality in many of today’s companies.

This article is the first of a two-part series on migrating from independent data marts to an architected solution. This installment will address the characteristics of independent data marts, the
flaws in their architecture, and the reasons why they exist. Part two will run in the next issue of TDAN and will address specifically how a company migrates from the independent data mart
architecture to an architected solution.

Characteristics of Independent Data Marts

Independent data marts are characterized by several traits. First, each data mart is sourced directly from the operational systems without the structure of a data warehouse to supply the
architecture necessary to sustain and grow the data marts. Second, these data marts are typically built independently of one another by autonomous teams. These teams usually
use varying tools, software, hardware, and processes.

Possibly the most visually descriptive trait of a company that has constructed independent data marts is that, once it maps out a schema of its decision support systems (DSS), the schema
resembles a “spaghetti” chart (See Figure I).* What is most disturbing is the number of companies that have said this chart resembles their current DSS
architecture.

 



Figure I: Independent Data Mart Architecture

As we can see, this architecture is not an architecture at all. Instead, it is a series of “stovepipe” DSS systems. This architecture differs greatly from that of an architected data
warehouse (See Figure II).

The purpose of this article is to discuss independent data marts and the process for migrating from them to an architected solution; however, we will briefly touch on the topic of DSS
architecture. We will not go into a detailed discussion of top-down vs. bottom-up approaches (we will save that topic for a future article), except to say that the “classic” top-down
approach is a more scalable and logical approach for constructing a DSS system. It is surprising how often the top-down methodology is mistaken for a “galactic” approach. This is a
misunderstanding, as the top-down approach is best used iteratively and incrementally to build the DSS system. When used in this fashion, the cost of building a data warehouse that feeds
“dependent” data marts becomes highly comparable to the cost of building independent data marts.

 



Figure II: Architected Decision Support System


Problems With Independent Data Marts

Redundant Data

As the number of independent data marts grows, the amount of redundant data begins to grow uncontrollably across the enterprise. This redundancy occurs because each of the independent data marts
requires its own, typically duplicated, copy of the detailed corporate data. Often a great deal of this detailed data is not required in the data marts, which typically provide summarized views.
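
The growth in redundant data can be made concrete with a small calculation. The sketch below is only an illustration: the function name, the row counts, and the simplifying assumption that every independent data mart keeps a full copy of the detailed data are all hypothetical, not figures from any real company.

```python
def stored_detail_rows(detail_rows: int, mart_count: int, architected: bool) -> int:
    """Return the total number of detail rows stored across the enterprise.

    Illustrative assumption: every independent data mart keeps a full copy
    of the detailed data, while an architected data warehouse keeps one copy
    and feeds summarized views to the dependent data marts.
    """
    copies = 1 if architected else mart_count
    return detail_rows * copies


# Hypothetical figures chosen only to show the arithmetic.
independent = stored_detail_rows(1_000_000, mart_count=5, architected=False)
architected = stored_detail_rows(1_000_000, mart_count=5, architected=True)
redundant = independent - architected

print(f"independent: {independent:,} rows")
print(f"architected: {architected:,} rows")
print(f"redundant:   {redundant:,} rows")
```

Under these assumptions the redundant rows grow linearly with the number of data marts, while the architected design stores the detail once regardless of how many marts are added.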

It would be enlightening if a study were conducted to calculate the costs of maintaining unnecessary redundant data for Fortune 1000 companies. I expect the end total would be in the billions of dollars
in expenses and lost opportunity.

Redundant Processing

A data warehouse provides the architecture to centralize integration and cleansing activities common to all of the data marts of a company. Without the data warehouse, all of these integration and
cleansing processes need to be duplicated for each of the independent data marts. This greatly increases the number of support staff required to maintain the DSS system, creating a particularly
disastrous situation for most companies in light of today’s IT staffing shortage.

Separate teams will typically build each of the independent data marts in isolation from one another. As a result, these teams do not leverage one another’s standards, processes, knowledge, and
lessons learned. This results in a great deal of rework and reanalysis.

These autonomous teams will commonly select differing tools, software, and hardware. This forces the enterprise to retain skilled employees to support each of these technologies. In addition, a
great deal of financial savings is lost because standardization on these tools doesn’t occur. Often a software, hardware, or tool contract can be negotiated to provide considerable discounts for
enterprise licenses, which can be phased in over time. These economies of scale can provide tremendous cost savings to the organization.

Scalability

Independent data marts directly read operational system files and/or tables, which greatly limits the DSS system’s ability to scale. For example, if a company has five independent data marts,
it is likely that each data mart would require customer information. Therefore, there would be five separate extracts being pulled from the same customer tables in the operational system of
record. Most operational systems have limited batch windows and cannot support this number of extracts. With a data warehouse, only one extract is required from the operational system of record.
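
The five-extract example above can be sketched in a few lines of Python. This is a hedged illustration, not a description of any particular tool: the mart names, the `customer` table name, and the `extract_jobs` function are all hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def extract_jobs(mart_sources: Dict[str, Set[str]], use_warehouse: bool) -> Counter:
    """Count how many extract jobs hit each operational table.

    Without a warehouse, every mart pulls each table it needs directly.
    With a warehouse, each needed table is pulled once and the warehouse
    feeds the marts, so the operational system sees one extract per table.
    """
    jobs: Counter = Counter()
    for sources in mart_sources.values():
        for table in sources:
            jobs[table] += 1
    if use_warehouse:
        return Counter({table: 1 for table in jobs})
    return jobs


# Hypothetical marts: five departments that each need customer data.
MARTS = {
    name: {"customer"}
    for name in ("sales", "marketing", "finance", "service", "risk")
}

direct = extract_jobs(MARTS, use_warehouse=False)
via_warehouse = extract_jobs(MARTS, use_warehouse=True)
print("independent marts:", direct["customer"], "extracts of customer")
print("data warehouse:   ", via_warehouse["customer"], "extract of customer")
```

The point of the sketch is that the load on the operational system grows with the number of marts in the independent design, but stays constant per source table once a warehouse sits between the operational systems and the marts.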

Non-Integrated

As previously discussed, each independent data mart is built by an autonomous team, typically working for a separate department. As a result, these data marts are not integrated and none of them
contains an enterprise view of the corporation. Therefore, if the CEO asks the IT department to provide him with a “listing of our most profitable customers”, each data mart will offer a
different answer. Having worked with companies that have experienced this exact situation, I can attest that the CIO is rarely pleased to have to explain why his department cannot answer this
seemingly simple question.
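
A minimal sketch of how this happens is shown below. It assumes two hypothetical departmental marts that each hold their own profit figures for the same customers, calculated under different departmental definitions of profit. The customer names, mart names, and numbers are invented for illustration.

```python
from typing import Dict


def top_customer(profit_by_customer: Dict[str, float]) -> str:
    """Return the customer with the highest profit in one mart's data."""
    return max(profit_by_customer, key=profit_by_customer.get)


# Hypothetical data: each mart computed "profit" its own way, for example
# one ignoring support costs and the other including them.
MART_PROFIT = {
    "sales_mart": {"Acme": 120.0, "Globex": 90.0, "Initech": 60.0},
    "finance_mart": {"Acme": 50.0, "Globex": 80.0, "Initech": 55.0},
}

# Ask every mart the CEO's question and compare the answers.
answers = {mart: top_customer(profits) for mart, profits in MART_PROFIT.items()}
marts_disagree = len(set(answers.values())) > 1

for mart, customer in answers.items():
    print(f"{mart}: most profitable customer is {customer}")
print("marts disagree:", marts_disagree)
```

Neither mart is wrong by its own definition. The disagreement comes from the absence of a single, integrated definition of profit, which is what an enterprise view in the data warehouse is meant to supply.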

One of the chief phenomena facing corporations today is the current merger and acquisition craze. Interestingly enough, one of the key factors fueling this movement is these companies’ desire to
reduce their IT spending. In light of this situation, the costs associated with independent data marts become even more magnified as companies continue to focus on controlling their ever-growing IT
costs.

It is important to note that many companies that have built independent data marts are currently in the process of migrating off of them. Needless to say, the cost of the migration, in dollars and time,
is not trivial.

Why Do Independent Data Marts Exist?

With all of these architectural flaws, it would seem surprising that so many companies have built their DSS systems around this architecture. There are several reasons why this aberration has
occurred.

DSS Are Complex

When the decision support craze spread, most companies were looking to build a data warehouse of their own. Unfortunately, the task of building a well-architected and scalable business intelligence
system is complicated and requires sophisticated software, expensive hardware, and a highly skilled and experienced team. Finding data warehouse architects and project leaders who truly understand
data warehouse architecture is a daunting challenge, both in the corporate and consulting ranks.

In order to construct a data warehouse, a corporation must truly come to terms with its data and the business procedures that the data represent. While this task is challenging, it is a necessary
step and one from which the true value of the DSS process is derived.

Independent Data Mart Shortcut

Building independent data marts is less expensive than building architected decision support systems. In addition, independent data marts can be constructed fairly quickly and do not require a company to
understand its data beyond the level of individual departments, as a data warehouse does. These points have been effectively used to sell the concept of constructing independent data marts.
Unfortunately, it is this lack of thorough analysis and long-term planning that keeps independent data marts from being an effective business intelligence system.

Inappropriate Vendor Messages

Many vendors have developed tools that are effective at building small departmental independent data marts. These companies, in their rush to market with these tools, have worked very hard at selling
the independent data mart concept (of course it is never worded like this). The reasons are obvious. These companies can significantly reduce their sales cycles because only one department is
involved in the software purchasing decision. In addition, their software requires much less sophistication because it merely needs to build a standalone data store.

The current vendor buzzword in today’s market is “turnkey”. Everyone seems to offer a “turnkey” DSS solution. Unfortunately, merely purchasing a “turnkey”
solution does not alleviate the task of learning and understanding a corporation’s data and its business processes. Integration of data from disparate systems requires a careful analysis
and an understanding of business processes and the data that represent them. There isn’t a “magic bullet” or “turnkey” solution that alleviates this task.

The second part of this two-part series will take an in-depth look at how to migrate off of this flawed architecture. In that article we will present the two approaches for migrating from
independent data marts, identify the necessary initial corporate decisions, describe methods for identifying the migration path to the architected solution, and walk through an independent data mart
migration case study.

* It is important to note that this chart is an actual client’s DSS architecture schematic. I’m proud to say that they are no longer on this architecture.


About David Marco

Mr. Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence, and is the world’s foremost authority on metadata.  Mr. Marco is the author of several widely acclaimed books including “Universal Meta Data Models” and “Building and Managing the Meta Data Repository: A Full Life-Cycle Guide”.  Mr. Marco has taught at the University of Chicago, DePaul University, and in 2004 he was selected to the prestigious Crain’s Chicago Business "Top 40 Under 40". He is the founder and President of EWSolutions, a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing, enterprise architecture and managed metadata environment technologies (www.EWSolutions.com).
