EII – Dead on Arrival

The computer industry loves its buzzwords, and one that has cropped up in recent years is “enterprise information integration” (EII). The premise is familiar: companies have their
data locked up in multiple, incompatible IT systems: ERP, CRM, supply chain, etc. At present, the only way to make sense of it is to extract data from these systems, try to resolve inconsistencies
and data quality issues, and then load the result into a data warehouse, from which you can report on the data in a common form. Unfortunately this approach is hard: you discover that the data
quality in even the shiniest new ERP systems is not what it might be, you have to unravel the differences between the way that various business units classify products, channels and customers, then
you have to design and build a data warehouse, and the subset “data marts” from which you can report using one of the many well-established reporting tools around (such as BusinessObjects).

EII vendors’ technology has genuine application in answering questions like “give me a view of all the data we have on customer x”, which involve access to current data, what some
term “lightweight BI”. However, they have recently been peddling their products for more general business intelligence applications. After all, why go to all the trouble of building a data
warehouse when someone can come along with a technological magic wand? Vendors with “EII” solutions have whitepapers that scorn today’s approach to business intelligence, promising that their
technology can merely look at all those inconsistent source systems and somehow run queries that will give the answers without having to go through all that dull work of building and populating a
data warehouse.

Well – that’s it then: what were we all thinking? The data was there all the time in the source systems, and for over a decade people have unaccountably been copying it somewhere else in order to
report on it; what a bunch of dopes they were! How much simpler just to access the data directly in real-time from the sources: how very “real time enterprise”.

Some people who should know better have swallowed this EII mirage hook, line and sinker, and a number of start-up companies have been funded flaunting “EII for business intelligence” messages.
The only problem with this new futurist approach is that it is absolutely and utterly flawed.

Let’s consider the problem again. You have data in dozens of incompatibly structured source systems. Your new EII software is somehow going to build a presumably fairly complex set of distributed
queries that will zip off to the source transaction systems, interrogate them and bring back a result set that will somehow produce a consistent answer.

The first problem is: how exactly does the EII software know what the linkages are between the differently coded source system structures? Somewhere it is going to have a catalogue which will
translate the differences, rather like a dictionary to translate words from one language to another. This sounds suspiciously like a metadata dictionary of the type that data warehouses have to
construct, but let’s leave that aside for the moment.
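
To make the point concrete, here is a minimal sketch of what such a catalogue amounts to. Everything in it is hypothetical (the system names, the product codes, the `CATALOGUE` table and the helper functions); it illustrates the idea and does not describe any vendor's product. Note that somebody still has to build and maintain the mapping by hand, and that any code missing from it silently drops out of the answer.

```python
# Hypothetical cross-reference catalogue:
# (source system, local product code) -> common product key.
CATALOGUE = {
    ("ERP", "P-1001"): "WIDGET_STD",
    ("CRM", "widget-standard"): "WIDGET_STD",
    ("SUPPLY", "00042"): "WIDGET_STD",
    ("ERP", "P-2002"): "GADGET_PRO",
}


def to_common_key(system, local_code):
    """Translate a source-system code to the common key, or None if unmapped."""
    return CATALOGUE.get((system, local_code))


def combine_sales(rows):
    """Sum amounts per common key and collect the rows that cannot be translated."""
    totals = {}
    unmapped = []
    for system, local_code, amount in rows:
        key = to_common_key(system, local_code)
        if key is None:
            unmapped.append((system, local_code))
            continue
        totals[key] = totals.get(key, 0) + amount
    return totals, unmapped


# Made-up rows as they might come back from three source systems.
rows = [
    ("ERP", "P-1001", 100),
    ("CRM", "widget-standard", 50),
    ("SUPPLY", "00042", 25),
    ("CRM", "gadget-pro", 70),  # no catalogue entry, so it cannot be combined
]

totals, unmapped = combine_sales(rows)
print(totals)
print(unmapped)
```

The unmapped row is the interesting one: without the dull work of curating the catalogue, the “consistent answer” is simply missing data.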

What exactly happens when those distributed queries make their way through to the source systems? For a start, the unpredictable nature of queries will upset the careful load balancing done by
operations departments to optimize on-line throughput. Or rather, it won’t, because no systems managers are going to allow this technology anywhere near their delicately balanced systems, at least
not after the first time it brings the ordering system to a grinding halt.

The next problem with the EII approach is that there is no history. For transaction systems, you want to archive data quickly in order to maintain high performance (there is no need to worry about
what your account balance was last year, just what it is now; last year’s balance can be archived). However, an inquiry like “show me the trend in account withdrawals over the last year in the
south-east region” does require historical data.
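
A small sketch shows the gap. The data and names below are invented for illustration: `current_balance` stands in for what an operational system typically keeps, while `withdrawals` stands in for the dated records that a warehouse would retain. A trend can only be computed from the second.

```python
from datetime import date

# What a transaction system typically holds: current state only (hypothetical data).
current_balance = {"ACC-1": 900, "ACC-2": 450}

# What a warehouse would retain: dated records (date, region, amount withdrawn).
withdrawals = [
    (date(2023, 1, 15), "south-east", 200),
    (date(2023, 1, 28), "south-east", 100),
    (date(2023, 2, 3), "north", 500),
    (date(2023, 2, 20), "south-east", 150),
]


def monthly_trend(records, region):
    """Total withdrawals per (year, month) for one region, in date order."""
    trend = {}
    for day, rec_region, amount in records:
        if rec_region != region:
            continue
        month = (day.year, day.month)
        trend[month] = trend.get(month, 0) + amount
    return sorted(trend.items())


print(monthly_trend(withdrawals, "south-east"))
```

Nothing in `current_balance` can answer the trend question; once the history has been archived out of the source system, a live query has nothing to query.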

Next, do these vendors really think that all the analysis hierarchies needed are embedded within the ERP systems? To take the example of marketing, there are normally complex segmentation
hierarchies for analysis purposes that are usually held in entirely separate places from the core transaction systems, and are not stored along with each order or invoice.

Just as importantly, the EII tools entirely ignore the tedious problem of data quality. It may be news to vendors who have more experience producing PowerPoint slides than production code, but the
quality of data lurking in the transaction systems is not what it might be. This is why there is an industry of products to assist with improving data quality, and why a significant chunk of any
data warehouse project budget is associated with data quality. Oh that’s right; you don’t need a data warehouse any more, so I guess you may as well ignore that pesky data quality problem as well.

Finally, what happens if there is actually a change in the structure of the transaction system, e.g. a change to the general ledger structure, or the way in which the bank accounts are grouped,
perhaps following a reorganization? The EII approach will disappoint, since it keeps no history of the hierarchies that were in place at the time. To be fair, this problem can also challenge
conventional data warehouse approaches, but at least it can be tackled, albeit with difficulty.
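
One common way a warehouse can tackle it is to keep effective-dated hierarchy records, so that a report can ask which grouping applied on a given date. The sketch below assumes a hypothetical `HIERARCHY_HISTORY` table and account names; it is an illustration of the general technique, not a description of any particular product.

```python
from datetime import date

# Hypothetical effective-dated hierarchy:
# (account, group, valid_from, valid_to). valid_to is exclusive; None means "still current".
HIERARCHY_HISTORY = [
    ("ACC-1", "Retail", date(2020, 1, 1), date(2023, 7, 1)),
    ("ACC-1", "Private Banking", date(2023, 7, 1), None),
]


def group_as_of(account, as_of):
    """Return the group the account belonged to on the given date, or None."""
    for acc, group, start, end in HIERARCHY_HISTORY:
        if acc == account and start <= as_of and (end is None or as_of < end):
            return group
    return None


print(group_as_of("ACC-1", date(2023, 6, 30)))  # before the reorganization
print(group_as_of("ACC-1", date(2023, 7, 1)))   # after the reorganization
```

A source system that only stores today's grouping can answer the second question but not the first.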

So, with EII for business intelligence, you can’t deal with business change at all, data quality is AWOL, you can’t look at trends, and you are likely to dim the lights in the computer room and bring
the key operational systems of the company to a grinding halt. Other than that, it is a great idea.

Next time someone tries to sell you some software that appears to be a bit too close to sleight of hand, check very carefully the customer references of people actually using the software in this
way. According to a leading industry analyst, only two EII vendors can give any decent customer references at all. The software industry has years of practice at writing convincingly argued
whitepapers that spin a compelling case, yet only when customers hand over hard cash do vendors seriously invest in the development to make it work. Always remember the caution used in the wonderful
film The Princess Bride: “Life is pain, and anyone who tells you different is trying to sell you something.”



About Andy Hayler

Andy Hayler is one of the world’s foremost experts on master data management. Andy started his career with Esso as a database administrator and, among other things, invented a “decompiler” for ADF, enabling a dramatic improvement in support efforts in this area. He became the youngest ever IT manager for Esso Exploration before moving to Shell. As Technology Planning Manager of Shell UK he conducted strategy studies that resulted in significant savings for the company. Andy then became Principal Technology Consultant for Shell International, engaging in significant software evaluation and procurement projects at the enterprise level. He then set up a global information management consultancy business which he grew from scratch to 300 staff. Andy was architect of a global master data and data warehouse project for Shell downstream which attained USD 140M of annual business benefits.

Andy founded Kalido, which under his leadership was the fastest growing business intelligence vendor in the world in 2001.  Andy was the only European named in Red Herring’s “Top 10 Innovators of 2002”.  Kalido was a pioneer in modern data warehousing and master data management.

He is now founder and CEO of The Information Difference, a boutique analyst and market research firm, advising corporations, venture capital firms and software companies.   He is a regular keynote speaker at international conferences on master data management, data governance and data quality. He is also a respected restaurant critic and author (www.andyhayler.com).  Andy has an award-winning blog www.andyonsoftware.com.  He can be contacted at Andy.hayler@informationdifference.com.