Published in TDAN.com July 2003
There are different ways of looking at how an organization makes use of information. For the most part, data is seen as the “raw input” that fuels a number of the basic
operations of an organization. Many data processing activities result in what we could call “operational by-products” – accounts being settled, invoices printed and mailed, pick
lists generated for shipping manifests, etc.
Information practitioners have traditionally been entrusted with the implementation of these operational tasks, yet the business of business is more likely to have been assigned to a business
client, who not only directs what the technologists are to do, but also controls the budgets that feed the technology machinery. When IT staff members perceive an opportunity to improve the way
the business is run, the typical interaction involves trying to convince the business clients of the value of doing something new, differently, faster, more efficiently, and so on. Alternatively,
when the business clients want to improve the business, they often end up trying to convince those IT folks that it is possible to change, even within the current environment.
The message is simple: the way we have done business in the past has imposed an adversarial subtext underlying the business-IT relationship, and this adversity hides behind a large part of the
information governance (or more likely, lack thereof) within any group’s data management bureaucracy. However, there are new and different ways to think about information, in which a
company’s data is seen as an asset that can be manipulated in different ways to create new wealth. New technologies coupled with creative business thinking pave the
way for successful analytics, knowledge discovery, and general business intelligence activities.
A New Way of Thinking revolves around a simple concept: use information to improve the business instead of just running the business. Whether this means creating a data warehouse
that feeds OLAP analysis, or data mining for the purpose of predicting customer behavior, or whether it just means providing better quality data, ongoing monitoring, or streamlining processes, it
requires some new thought processes surrounding the use and management of data. Organizations that take this new approach are positioned to gain a competitive advantage
over those that do not.
My motivation in writing this new column for TDAN.com is to help figure out the best ways that we can inspire influential people within an organization to adopt a new way of thinking about
information. I believe a good way to do this is to challenge some of the conventional wisdom by looking at different kinds of business problems and exploring how they are related to traditional
information management techniques. I’ll then suggest some ideas to think about that should help reframe the problem outside of the traditional context. Also, in each column I intend to pose a
question to the readers related to the topic as a way to stimulate reaction and start the thought process going. Last, in each subsequent column we’ll review the responses and see what we can
learn from collective experience.
Now, on to this column’s topic:
Cost, Game Theory, and Responsibility
We have all heard the numbers: 70-80% of the costs associated with building a data warehouse are related to data integration and data quality; scrap and rework attributable to poor data quality
account for 20-25% of an organization’s budget (or, some even say, gross revenues). Whether these numbers are derived from experience or whether they are apocryphal yet have been quoted (or
misquoted) repeatedly for a long time, we just take these kinds of statements as truth.
And these numbers seem to make sense. We know that a large part of a data warehouse project is the identification of the different data sources, preparing those data sets for transformation and
loading, and integrating information from disparate data sets – this accounts for the “70-80%.” As for the numbers related to scrap and rework, we all know that when a mistake is
made, results are thrown out and the process needs to be redone. I am not even going to dispute these numbers – we will just assume they are valid.
So let’s presume that a significant percentage (we will be conservative: 10%) of a company’s revenues evaporates due to various and sundry broken processes that create or propagate low
quality information throughout the enterprise. For example, a medium-sized company with annual revenues of $25 million is losing $2.5 million due to poor data quality. If this is true, then why
wouldn’t every CEO and CFO be screaming that the company should initiate a data quality improvement program? And while the topic of improved data quality has gained a lot of momentum in the
practitioner space, it still has not bubbled up to the top of the stack on the CEO’s desk.
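The arithmetic behind that example can be sketched in a few lines of Python. The function name and the default rate are illustrative assumptions; the 10% figure is the presumption made above, not a measured value.

```python
def estimated_quality_loss(annual_revenue: float, loss_rate: float = 0.10) -> float:
    """Return the revenue presumed lost to poor data quality.

    loss_rate is an assumed fraction of revenue (0.10 = the 10% presumed above).
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    return annual_revenue * loss_rate


# The medium-sized company from the example: $25 million in annual revenue.
loss = estimated_quality_loss(25_000_000)
print(f"Presumed annual loss: ${loss:,.0f}")
```

Changing the rate to the often-quoted 20-25% range shows how quickly the presumed loss grows for the same company.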
What are the costs associated with information scrap and rework? According to Larry English (in his excellent book, “Improving Data Warehouse and Business Information Quality”), they include:
- Redundant data handling and support (associated with the collection and management of multiple copies of the same information)
- Costs of hunting or chasing information (when knowledge workers have to track down missing information)
- Business rework costs (when business processes need to be repeated because of errors)
- Workaround costs (accumulated as lost productivity)
- Verification costs (when knowledge workers need to verify information from different sources in order to trust that information)
- Software rewrite costs (attributed to fixing and recovering from failed programs)
Although these are all valid costs, there is a deeper business problem inherent here in the fact that the parties responsible for the introduction of low quality information are not necessarily the
same parties affected by the manifestation of costs related to that “bad” data. In turn, those tasked with improving the generated data product do not really address the real source of
the problem; they are only treating the symptom.
A good example involves any data cleansing that takes place during the Extract/Transform/Load (ETL) process associated with data warehouse population. A large part of the data cleansed during this
process is not owned by the technicians tasked with the cleansing. Yet, for each of the data suppliers, the quality of the information is probably good enough for the original purpose, and so
unless the suppliers are also clients of the data warehouse, there is little motivation to invest their own budget to implement any specific data quality improvements that only benefit other
parties.
In fact, when we look at the scrap and rework costs enumerated by Larry English, very frequently the costs attributable to poor information quality are borne by parties other than the ones
introducing the low quality information. Workarounds, chasing information, and verification are even understood to be part of some job descriptions, and the business clients are not even aware that
they are reconciling someone else’s data problems. And in those organizations where there is a clear recognition of the problem, unless any single cost is both significant and is borne by the
same group that holds responsibility for that data, it is not likely to be remedied without pressure from centralized senior management responsible for the organizational information resource.
This, and similar scenarios, remind me of the little that I have learned about game theory. Intra-corporate battles over who pays for improvement are similar to the “zero-sum game,”
where (loosely defined) a move that benefits one party is detrimental to the opposing party to the same degree. Added on top of that, companies that are successful can hide the pain associated with
data quality and essentially deny its relevance. A colleague mentioned to me that he had heard senior managers say that their company was doing well and making a lot of money, so they didn’t
see any way that poor data quality could be affecting the business.
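The loose definition of a zero-sum game used above can be made concrete with a small Python sketch. The function name, the two groups, and every figure below are hypothetical and chosen only to illustrate the definition.

```python
from typing import Iterable, Tuple


def is_zero_sum(outcomes: Iterable[Tuple[float, float]]) -> bool:
    """Return True if, for every move, one party's gain equals the other's loss."""
    return all(gain_a + gain_b == 0 for gain_a, gain_b in outcomes)


# Hypothetical budget fight (figures in thousands of dollars): whichever
# group pays for the cleanup loses exactly what the other group keeps.
# Each tuple is (effect on data supplier, effect on warehouse team).
budget_fight = [(-100, 100), (100, -100)]

# Hypothetical contrast: the supplier spends 100 and the warehouse team
# saves 250, so the gains and losses no longer cancel out.
shared_benefit = [(-100, 250)]

print(is_zero_sum(budget_fight))    # True
print(is_zero_sum(shared_benefit))  # False
```

Seen from inside either group, the budget fight looks like the first case, which is why neither side volunteers to pay.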
In order to influence any one party to invest in data quality improvement, it must be shown how that improvement is going to benefit the stakeholder. It is this idea that has necessitated the
development of the return on investment (ROI) arguments for data quality improvements. But even with a reasonable ROI argument, implementing enterprise data quality improvements is still subject
to many other issues aside from data ownership, such as organizational behavior, job security, power struggles, and cross-group coordination, among others.
This effectively establishes the line that distinguishes forward-thinking senior managers (particularly CIOs and CFOs) from their less strategic colleagues. A strategic CIO will recognize that the
value of the organization’s information asset is significantly increased when all risks associated with poor data quality are removed. The tactical CIO will wait until the problem occurs to
start worrying about fixing it. What kind of senior managers rule at your company?
Question of the Quarter
In the interest of drawing out more information on the topics covered in my column, each issue I will pose a question to the TDAN.com readers. At the end of each quarter’s column I will review
your answers and explore what can be learned from reader experiences. Please email your responses to email@example.com.
This quarter I have a two-sided question: For those of you involved in a successful data quality improvement project, what was the business motivation that necessitated the creation of the data
quality and/or stewardship program? And for those of you where the program has not yet gotten off the ground: what has prevented those controlling the budgets from investing in the program?
Copyright © 2003 Knowledge Integrity, Inc.