Through the years, many architectures have been introduced for developing business intelligence systems, including such classic architectures as Ralph Kimball’s Data Warehouse Bus Architecture, Bill Inmon’s Corporate Information Factory, and his more recent architecture called Data Warehouse 2.0.
These architectures have been deployed by most organizations and have served them well for the last fifteen to twenty years. For a long time we had good reasons to use these architectures, because the state of database, ETL, reporting, and analytical technology did not truly allow us to develop systems based on other architectures. In addition, most tools were aimed at supporting these classic architectures, which made developing systems with other architectures hard.
The question we have to ask ourselves is: Are these still the right architectures? Are these still the best possible architectures we can come up with, especially if we consider the new demands and requirements of users, and if we look at new technologies available in the market, such as analytical database servers, data warehouse appliances, and in-memory analytical tools? My answer would be no! To me, we are slowly reaching the end of an era, one in which the classic architectures were dominant. It’s time for change. This article describes an alternative architecture, one that is more flexible and fits the needs and demands of most organizations for (hopefully) the next twenty years.
This new architecture is called the Data Delivery Platform (DDP) and was introduced in a number of articles published at BeyeNETWORK.com (see The Definition of the Data Delivery Platform). The definition of the DDP is:
Fundamental to the DDP are two principles. The first one is decoupling of data consumers and data stores and the second is shared specifications. Let’s explain those two principles.
In a business intelligence system with a DDP-based architecture, data consumers are decoupled from the data stores by a software layer. This means that data consumers (such as reports developed with SAP BusinessObjects Web Intelligence, SAS Analytics, JasperReports, or Excel) don’t know which data stores are being accessed: a data warehouse, a data mart, or an operational data store. Nor do they know which data store technologies are being accessed (an Oracle or DB2 database, or maybe Microsoft Analysis Services). The data consumers only see and access the software layer, which presents all the data stores logically as one big database; see Figure 1. The data consumers have become data store independent.
The advantages resulting from decoupling are:
Decoupling data consumers from data stores is based on the concept of information hiding. This concept was introduced by David L. Parnas (see ‘Software Fundamentals, Collected Papers by David L. Parnas’, Addison-Wesley Professional, 2001) in the ’70s and was adopted soon after by object-oriented programming languages, component-based development, and service-oriented architectures. But until now, the concept of information hiding has received only limited interest in the world of data warehousing.
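The idea can be illustrated with a small sketch. It is not how any particular product implements a DDP; the class names (`DataStore`, `InMemoryStore`, `DeliveryLayer`), the `sales` table, and the in-memory stores are all hypothetical, chosen only to show that a data consumer asks the layer for a logical table and never learns which data store answered.

```python
from typing import Dict, List, Protocol

Row = Dict[str, object]


class DataStore(Protocol):
    """Anything that can return the rows of a named table (hypothetical interface)."""

    def fetch(self, table: str) -> List[Row]: ...


class InMemoryStore:
    """Stand-in for a real data store such as a data warehouse or a data mart."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def fetch(self, table: str) -> List[Row]:
        return list(self._tables[table])


class DeliveryLayer:
    """The software layer: maps logical table names to whichever store holds them."""

    def __init__(self) -> None:
        self._routes: Dict[str, DataStore] = {}

    def register(self, table: str, store: DataStore) -> None:
        # Re-registering a table moves it to another store without touching consumers.
        self._routes[table] = store

    def query(self, table: str) -> List[Row]:
        return self._routes[table].fetch(table)


def sales_report(layer: DeliveryLayer) -> int:
    """A data consumer: it only knows the layer and the logical table name."""
    return sum(int(row["amount"]) for row in layer.query("sales"))


warehouse = InMemoryStore({"sales": [{"amount": 100}, {"amount": 250}]})
data_mart = InMemoryStore({"sales": [{"amount": 100}, {"amount": 250}]})

layer = DeliveryLayer()
layer.register("sales", warehouse)
total_before = sales_report(layer)

layer.register("sales", data_mart)  # migrate the table to a different store
total_after = sales_report(layer)
```

The report function is unchanged when the `sales` table moves from one store to the other; only the registration inside the layer changes. That is the information-hiding property the decoupling principle relies on.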
The second principle of the DDP is called shared specifications. Most reporting and analytical tools require specifications to be entered before reports can be developed. Some of those specifications are descriptive and others are transformative. Examples of descriptive specifications are definitions of concepts; for example, a customer is someone who has bought at least one product, and the Northern region doesn’t include the state of Washington. But defining alternative names for tables and columns, and defining relationships between tables, are also descriptive specifications. Examples of transformative specifications are ‘how should country codes be replaced by country names’ and ‘how should a set of tables be transformed into one cube’. In the DDP those specifications are centrally managed and are shareable. The advantages resulting from shared specifications are:
Currently, the simplest way to develop a DDP-based business intelligence system is by using a federation server. There are many federation servers available on the market, including Composite Information Server, Denodo Platform, IBM InfoSphere Federation Server, Informatica Data Services, Oracle BI Server, and Red Hat MetaMatrix Enterprise Data Services. As an example, the article Using Composite Information Server to Develop a DDP describes how to develop a DDP with Composite’s product.
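As a rough illustration of both principles, the sketch below uses SQLite from Python's standard library as a stand-in for a federation server; it does not show how any of the products above are configured. Two separate databases play the role of data stores, and a view defined once in a central place holds the shared specifications: a descriptive one (a customer is someone who has bought at least one product) and a transformative one (country codes are replaced by country names). All database, table, column, and view names are assumptions made for the example.

```python
import sqlite3

# The connection plays the role of the software layer; "crm" and "sales" are
# two separate in-memory databases standing in for different data stores.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS sales")

conn.executescript("""
    CREATE TABLE crm.person (id INTEGER PRIMARY KEY, name TEXT, country_code TEXT);
    CREATE TABLE crm.country (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE sales.purchase (person_id INTEGER, product TEXT);

    INSERT INTO crm.person VALUES (1, 'Ann', 'NL'), (2, 'Bob', 'US'), (3, 'Cy', 'US');
    INSERT INTO crm.country VALUES ('NL', 'Netherlands'), ('US', 'United States');
    INSERT INTO sales.purchase VALUES (1, 'book'), (1, 'pen'), (3, 'lamp');
""")

# Shared specifications, defined once. A TEMP view is used because SQLite only
# lets temporary views refer to tables in other attached databases.
conn.executescript("""
    CREATE TEMP VIEW customer AS
    SELECT p.id, p.name, c.name AS country          -- transformative: code -> name
    FROM crm.person AS p
    JOIN crm.country AS c ON c.code = p.country_code
    WHERE EXISTS (                                   -- descriptive: has bought something
        SELECT 1 FROM sales.purchase AS s WHERE s.person_id = p.id
    );
""")

# Two different "reports" reuse the same definition of a customer.
customers = conn.execute(
    "SELECT name, country FROM customer ORDER BY id"
).fetchall()
per_country = conn.execute(
    "SELECT country, COUNT(*) FROM customer GROUP BY country ORDER BY country"
).fetchall()
```

Both queries see only the logical `customer` view. Neither needs to know that people and purchases live in different stores, and if the definition of a customer changes, it changes in one place for every report.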
To summarize, the Data Delivery Platform is a business intelligence architecture that offers many practical advantages for developing business intelligence systems, including increased flexibility of the architecture, shareable transformation and reporting specifications, easy migration to other data store technologies, cost reduction due to simplification of the architecture, easy adoption of new technology, and transparent archiving of data. The DDP can co-exist with other, better-known architectures, such as the Data Warehouse Bus Architecture and the Corporate Information Factory.