Are You The Master of Your Data, Or its Slave?

Published in January 2006

The emerging field of “master data management” is simply how to deal with data that needs to be shared between different computer systems, data such as products or customers. So, four years after
the big rationalization of IT systems in the late 1990s, how many different computer systems does a large company have? One subsidiary of a large energy company, after putting in every module of
SAP, still had 175 interfaces to other systems. At a recent conference, a CIO of a large multinational admitted to having 650 different installed enterprise applications.

According to a survey by Tower Group, companies maintain master data separately in 11 or more source systems. The picture is actually even more complex than this, since large companies
rarely have just one instance of an Enterprise Resource Planning (ERP) or Customer Relationship Management (CRM) system. It is not unusual for global companies to have 50 or more separate ERP
instances, each of which will have a slightly different implementation from the next. So how exactly do the codes for “customer” and “product” get managed across this panoply of systems?

The short answer is: with difficulty. The management of what is becoming known as “master data”, that is, things such as products, channels, locations, customers, organizational units (as
distinct from the business transactions themselves), is a giant headache. Inconsistent master data not only makes it hard to analyze business performance but can also cause operational problems, for
example, duplication of product lines. Most organizations today are slaves rather than masters of their corporate data.

Shell Lubricants was an interesting case in point. They had a broadly decentralized structure and had made acquisitions such as Pennzoil, itself with an extensive product line, and had in total
30,000 packed product combinations. When they wished to globalize their business processes, and also move more sales to the web, it was clear that they would need to improve the management of the
master data in this area, such as product codes, formulations, and brand names. Over the years local subsidiaries had created many local variations of products to suit local markets (a lubricant
formula that works well in the steamy heat of Vietnam may not flourish in a Norwegian winter and vice versa), but it was likely that not all these product variations were necessary. However,
standardization to one universal set of master data was entirely impractical, as the costs of modifying the codes, safety sheets, packaging and brand data embedded in dozens of ERP and other
operational systems would be huge.

There were two major stages to this project: the first was understanding the problem, the second was using these insights to make operational changes to the business. Initially the team went
through a process of mapping each product against a set of criteria to see whether differences were really needed, identifying overlaps, and then coming up with a new set of products. Local
management were challenged as to whether the local market variations were real or could in fact already be catered for by a product that existed elsewhere.

The next stage was to improve the operational business based on this insight. By the time they finished, the count of truly different products was just one fifth of the product count in the
operational systems. A sophisticated system was developed to manage the mapping between the new definitions and the multiple aliases of each in the various operational systems. By linking up this
global product catalog to an established EAI tool from IBM, product managers are now able to not just view, but also update product definitions across the globe, with changes driven back from the
master catalog into the operational systems.
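
The mapping between one master definition and its many local aliases can be sketched in a few lines of Python. This is only an illustration of the idea, not the system Shell built: the class names, system names and product codes below are all invented, and a real catalog would carry far more than a name for each product.

```python
from dataclasses import dataclass, field


@dataclass
class MasterProduct:
    """One globally agreed product definition (hypothetical structure)."""
    master_id: str
    name: str
    # Each alias is a (source system, local product code) pair. One system
    # may hold several local codes that all map to the same master product.
    aliases: set[tuple[str, str]] = field(default_factory=set)


class ProductCatalog:
    """Keeps master definitions and resolves local codes back to them."""

    def __init__(self) -> None:
        self._products: dict[str, MasterProduct] = {}
        self._alias_index: dict[tuple[str, str], str] = {}

    def add_product(self, master_id: str, name: str) -> None:
        self._products[master_id] = MasterProduct(master_id, name)

    def link_alias(self, master_id: str, system: str, local_code: str) -> None:
        self._products[master_id].aliases.add((system, local_code))
        self._alias_index[(system, local_code)] = master_id

    def resolve(self, system: str, local_code: str) -> MasterProduct:
        """Find the master definition behind a local code."""
        return self._products[self._alias_index[(system, local_code)]]

    def rename(self, master_id: str, new_name: str) -> list[tuple[str, str, str]]:
        """Change the master record and list the updates each system needs.

        In practice an integration tool would deliver these updates; here
        they are simply returned as (system, local code, new name) tuples.
        """
        product = self._products[master_id]
        product.name = new_name
        return [(system, code, new_name) for system, code in sorted(product.aliases)]


catalog = ProductCatalog()
catalog.add_product("GP-001", "Multi-purpose grease")
catalog.link_alias("GP-001", "erp_france", "FR-7781")
catalog.link_alias("GP-001", "erp_vietnam", "VN-0042")
updates = catalog.rename("GP-001", "Multi-purpose grease EP2")
```

The point of the design is that the local codes are never thrown away. The operational systems keep their own identifiers, and the catalog carries the burden of knowing which of them mean the same thing.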

It is important to understand that this is not just some trivial mapping of product codes. This is best illustrated by a true story. The lead consultant on the project was in a French refinery and
meeting with a gentleman who had the wonderful job title of “grease designer,” hoping to leave with a copy of his product portfolio. On the desk was a pile of papers a foot high. When asked
whether this was the product portfolio, he laughed and explained that this was the information on just one product, covering formulation, components, properties, test methods, material safety
sheets etc. This is the master data that describes this particular product that was made and sold, and all of this has to be managed.

The key to success of this project was both the business buy-in and the ability of the technology used to accept that there were multiple “versions of the truth” due to the inherently
different needs of different business functions (manufacturing, distribution, marketing), i.e. that there was not just one single global definition of every product sold. This hard reality tends to
elicit a state of denial from three places.

Central CIO functions frequently feel uncomfortable admitting that the world cannot be represented on a single PowerPoint slide. They consequently prefer not to confront the business and admit the
extent of the problem.

The business users who are actually living with the lack of flexibility that the inability to manage master data brings frequently don’t realize that there is a better way, and too easily retreat
into departmental stovepipes, blaming the problem on “those people in marketing” (or manufacturing, or distribution, or sales).

Software vendors are uncomfortable because their installed base of applications works on the assumption that there is a single Holy Grail version of all data definitions and that any variation to
that is “legacy,” i.e. something that can be quietly ignored. After all, designing an application that can deal with 11 separate definitions of the same piece of data simultaneously is hard. It is
much easier to expect the customers to wave a magic wand and simplify their world to fit the limitations of the software applications.

When working for Shell I went to a well-known ERP vendor and explained a problem that one of our businesses was having, which involved sharing and comparing data across different modules of the
software: the response was “our software cannot do that, therefore it cannot be a valid business requirement,” something that came as a surprise to the Shell marketing VP sitting next to me whose
problem it was. The only alternative generally available today is EAI spaghetti: a range of interfaces so tangled that describing it takes something resembling a wiring diagram for a space shuttle rather
than a corporate IT architecture. Data movement should not be confused with data management.

Application software vendors, always quick to spot a revenue opportunity, have recently sniffed that there may be money in this area, so a range of “master data,” “data hub,” and other products
have been running off the presses of their slick marketing departments. However, while their products admit to the possibility that there may be more than one instance of their own applications
installed at a customer (progress of a sort), they essentially skim over the inconvenient notion that a customer may have applications from different vendors. Of course there may be an agenda here:
“ah well, if only you’d throw out that ERP/CRM/supply chain system from vendor X and standardize on ours globally…”. Yet after a decade of vast global projects (for example one large company
spent USD 1.5 billion rolling out SAP globally – and no, that is not a misprint for million), guess what: virtually every large company has a wide range of established applications each with its own
master data definition, even if those multiple system implementations are from fewer vendors. Large organizations frequently have dozens or even hundreds of slightly different implementations of
the same software deployed throughout their global operations.

The way forward starts, as at Shell Lubricants, with an admission by the business that it has a problem: master data scattered across departmental boundaries and responsibilities. Cross-functional
business teams need to sift the legitimately different requirements from the “not invented here,” and decide what has to be maintained separately but linked due to genuine differences (e.g. in
local markets) and what master data can be cleaned up. IDC calls this business function a “policy hub.”

What is needed are applications that don’t try to fit the world around a simplified version of reality, but are based on the assumption that business models are indeed complex, and can probably
never be standardized due to the inherently different needs of different markets, customers and varying business needs. Instead technologies are required that manage the complexity of real-world
business models, and expect there to be 11 or 57 different definitions of core master data.

ERP systems don’t solve the master data issue since they assume a uniform, standard business model (which in the real world does not exist in large companies) and tend to deal only with data held
in their own system. Even behemoth SAP applications cover only 30-70% of the business model of most large companies and vary significantly by industry (the 70% claim was from their own CEO, which
still leaves 30% elsewhere).

Data warehouses are a useful step forward provided they can handle multiple, concurrent, linked business definitions. However, they are not sufficient to solve the overall problem since an
application is then needed to deal with the workflow to support the business “policy hub,” that is, one that can deal with version control, authorization, security as well as analysis of the company’s
master data. The application needs to be able to deal with apparently “invalid” data that is effectively authored and evolved through a number of stages until it is reconciled; data warehouse
applications rightly reject such “invalid” data.
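
To make the idea of data being authored through a number of stages more concrete, here is a rough Python sketch of a change to master data moving from draft to published. It is an assumption-laden illustration rather than a description of any real product: the stage names, the `PolicyHub` class and the single-approver rule are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "draft"          # still being authored, may be incomplete
    PROPOSED = "proposed"    # submitted for authorization
    APPROVED = "approved"    # signed off by an authorized person
    PUBLISHED = "published"  # part of a released catalog version


# Each stage may only move on to the single stage that follows it.
_NEXT_STAGE = {
    Status.DRAFT: Status.PROPOSED,
    Status.PROPOSED: Status.APPROVED,
    Status.APPROVED: Status.PUBLISHED,
}


@dataclass
class ChangeRequest:
    record_key: str
    new_values: dict
    author: str
    status: Status = Status.DRAFT
    approver: Optional[str] = None


class PolicyHub:
    """Hypothetical workflow: propose, authorize, then publish a new version."""

    def __init__(self, approvers: set) -> None:
        self.approvers = approvers
        self.version = 0
        self.published: dict = {}
        self.history: list = []  # (version, record key, values) per release

    @staticmethod
    def _advance(change: ChangeRequest, target: Status) -> None:
        if _NEXT_STAGE.get(change.status) is not target:
            raise ValueError(f"cannot move from {change.status.value} to {target.value}")
        change.status = target

    def propose(self, change: ChangeRequest) -> None:
        self._advance(change, Status.PROPOSED)

    def approve(self, change: ChangeRequest, approver: str) -> None:
        if approver not in self.approvers:
            raise PermissionError(f"{approver} may not authorize changes")
        self._advance(change, Status.APPROVED)
        change.approver = approver

    def publish(self, change: ChangeRequest) -> int:
        """Release the change and return the new catalog version number."""
        self._advance(change, Status.PUBLISHED)
        self.version += 1
        self.published[change.record_key] = dict(change.new_values)
        self.history.append((self.version, change.record_key, dict(change.new_values)))
        return self.version


hub = PolicyHub(approvers={"product_manager"})
change = ChangeRequest("GP-001", {"name": "Multi-purpose grease EP2"}, author="analyst")
hub.propose(change)
hub.approve(change, "product_manager")
new_version = hub.publish(change)
```

Only published changes would be fed onward to operational systems or a data warehouse; drafts and proposals stay inside the hub, which is why it can tolerate data that those systems would reject.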

Master data management solutions are needed that allow customers to understand and analyze their multiple business definitions across the various source systems, propose changes to this master
data, get authorization where necessary, and then publish new versions of product catalogs and customer segmentation that can be fed into the core operational systems where appropriate.
Understanding this business process and realizing that a quite elaborate workflow process is required to do this still seems to elude most of our industry, as well as most IT departments.
Perhaps talking to business customers seriously about the problem might help. Just a thought.



About Andy Hayler

Andy Hayler is one of the world’s foremost experts on master data management. Andy started his career with Esso as a database administrator and, among other things, invented a “decompiler” for ADF, enabling a dramatic improvement in support efforts in this area. He became the youngest ever IT manager for Esso Exploration before moving to Shell. As Technology Planning Manager of Shell UK he conducted strategy studies that resulted in significant savings for the company. Andy then became Principal Technology Consultant for Shell International, engaging in significant software evaluation and procurement projects at the enterprise level. He then set up a global information management consultancy business which he grew from scratch to 300 staff. Andy was architect of a global master data and data warehouse project for Shell downstream which attained USD 140M of annual business benefits.

Andy founded Kalido, which under his leadership was the fastest growing business intelligence vendor in the world in 2001.  Andy was the only European named in Red Herring’s “Top 10 Innovators of 2002”.  Kalido was a pioneer in modern data warehousing and master data management.

He is now founder and CEO of The Information Difference, a boutique analyst and market research firm, advising corporations, venture capital firms and software companies. He is a regular keynote speaker at international conferences on master data management, data governance and data quality. He is also a respected restaurant critic and author. Andy has an award-winning blog.