Published in TDAN.com January 2004
A large portion of the work my company does for our clients involves coordinating data quality management across application teams, systems, or even the entire enterprise. Because of this, I am very happy to see an increase in the attention being given to data quality. One interesting metric is the number of articles and columns in the popular press discussing the growing industry-wide recognition of the value of high-quality data. Numerous reports issued over the past two and a half years indicate that information quality is rapidly increasing in importance among senior managers:
- According to the 2001 PricewaterhouseCoopers Global Data Management Survey, 75% of senior executives reported significant problems as a result of defective data.
- In 2002, the Data Warehousing Institute published its Report on Data Quality, which estimates that the cost of poor data quality to US businesses exceeds $600 billion each year.
- A 2003 Cutter survey reports that 60% to 90% of all data warehouse projects either fail to meet expectations or are abandoned, with the underlying implication that some of these failures result from poor data quality.
As more information is used for multiple purposes and exchanged, and as more business processes are automated, there is a clear return on investment in data quality improvement. The emerging recognition of the value of high-quality information has prompted many organizations to consider instituting a data quality management program, either as a separate function within a logical line of business, or even at the enterprise level. While this is admirable, there are a number of relevant issues that can impede the integration of information quality concepts into the managerial, operational, and technical aspects of the enterprise. Some of these critical issues include:
- Data ownership within the enterprise. Data ownership is a complicated notion, and it can contribute to significant strife within an organization. Often the managers or technicians entrusted with implementing an application infer ownership of the information used within that system. This introduces potential conflicts when those individuals are expected both to participate in enterprise-wide data quality initiatives and to expose the internals of their information management to data quality audits and reviews.
- Data management within vertical lines of business. Traditional organizational structure imposes a hierarchy in which the line-of-business (LOB) management chain has authority over the information used within the LOB, and consequently each LOB has its own requirements for data quality. However, once LOB data is used in ways never intended by the original implementers, new requirements may emerge that are more stringent than the originals, yet LOB managers may hesitate to invest resources in addressing issues that are not relevant to their own vertical business application.
- External data quality management. Another issue associated with the vertical organizational structure is the degree to which a data quality professional can encourage improvement when authority over the data lies outside his or her administrative arena.
- Data quality tools as the answer. A frequent response from senior management when building a data quality management program is to immediately begin researching the purchase of automated data cleansing or profiling tools. While some data quality tools do provide benefit right out of the box, without a well-defined understanding of the types and scope of specific data quality problems, and without a management plan for addressing discovered problems, buying a tool will not have a significant effect on achieving long-term strategic information improvement goals.
- Assumption that data quality is a technical, not a business, problem. Business clients frequently assume that any data quality issues are IT issues that should be addressed by the technical teams. On the other hand, if the business rules with which the data appears to be noncompliant are associated with the running of the business, shouldn’t those rules be owned and managed by the business client?
- Problem scoping issues. Anecdotal evidence may inspire attitudes about requirements for data quality, and arbitrary spot checks on the data by the technical team are often the accepted manner of analyzing data quality. But in the absence of a true understanding of the kinds of problems that occur, the scope of those problems, and their associated impacts, how can a manager determine the proper approach to fixing a problem, or measure the resulting levels of improvement?
- Reactive vs. proactive data quality. Most data quality programs are designed to react to data quality events instead of determining how to prevent problems from occurring in the first place. A mature data quality program determines where the risks are, what objective metrics determine the levels and impact of data quality compliance, and what approaches ensure high levels of quality (a minimal sketch of such a compliance metric follows this list).
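To make the notion of an objective compliance metric concrete, the following is a minimal sketch in Python. The record fields, business rules, and thresholds are illustrative assumptions of mine, not part of any particular tool or methodology; the point is simply that each rule, evaluated across a data set, yields a measurable, repeatable score rather than an anecdote.

```python
# Minimal sketch of rule-based data quality measurement.
# The record fields and business rules are illustrative assumptions.
import re

RULES = {
    # Completeness: every customer record should carry a name.
    "name_present": lambda r: bool(r.get("name", "").strip()),
    # Validity: a US ZIP code should be exactly five digits.
    "zip_valid": lambda r: re.fullmatch(r"\d{5}", r.get("zip") or "") is not None,
    # Reasonableness: birth year should fall within a plausible range.
    "birth_year_ok": lambda r: 1900 <= r.get("birth_year", 0) <= 2004,
}

def compliance(records):
    """Return, for each rule, the fraction of records that satisfy it."""
    if not records:
        return {name: 1.0 for name in RULES}
    return {
        name: sum(1 for r in records if rule(r)) / len(records)
        for name, rule in RULES.items()
    }

sample = [
    {"name": "Jones", "zip": "10017", "birth_year": 1961},
    {"name": "", "zip": "1001", "birth_year": 1875},
]
for rule, score in compliance(sample).items():
    print(f"{rule}: {score:.0%} compliant")
```

Scores of this kind give the data quality manager a defensible baseline, so that the scope and impact of problems can be discussed in terms of measurements rather than spot checks.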
Boiled down to its core, data quality management is a horizontal activity that is typically introduced into a vertical organization, and the failure of a data quality management program may derive from a simple observation: the people entrusted with ensuring or managing the quality of data usually do not have the authority to take the appropriate steps to improve it. Instead, data quality managers may exist as advisors to line-of-business data “owners,” with the task of inspiring those owners to take on the responsibility for ensuring the quality of the data.
If the intention of the data quality management program is to influence system management behavior in order to integrate ongoing data quality improvement, what is the best way to build the program? Clearly, there are risks in being too timid in approaching the topic with the de facto data owners, since a timid approach will never gain any serious momentum. On the other hand, there are risks in being too aggressive in introducing data quality concepts, which may appear intrusive and may challenge the authority of the application system managers.
One approach with which we have had some success is to incrementally introduce the components of a previously architected data quality management solution. Before any activity is started, have a well-thought-out plan for its justification, as well as a plan for measuring improvement after the activity is in place. Each newly introduced activity should be carried out in a manner that establishes its value and consequently encourages compliance, thereby gaining incremental acceptance and success. In this way one can introduce and document best practices associated with both departmental and enterprise-wide data quality, as well as provide guidance for the data quality manager in coordinating the integration of those best practices into the different departments within an enterprise.
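In the same spirit, “measuring improvement after the activity is in place” can be as simple as capturing compliance scores before the activity begins and comparing them against a later measurement. The sketch below is a hypothetical illustration (the rule names and numbers are invented), not a prescribed method:

```python
# Sketch: compare data quality compliance scores captured before an
# improvement activity (the baseline) against a later measurement.
# The rule names and scores here are invented for illustration.

def improvement(baseline, current):
    """Percentage-point change per rule between two measurements."""
    return {rule: current[rule] - baseline[rule] for rule in baseline}

baseline = {"name_present": 0.82, "zip_valid": 0.74}
current = {"name_present": 0.95, "zip_valid": 0.88}

for rule, delta in improvement(baseline, current).items():
    print(f"{rule}: {delta:+.1%} since baseline")
```

A report of this form (“zip_valid improved 14 percentage points since the cleansing rules were deployed”) is exactly the kind of evidence that establishes the value of each newly introduced activity.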
While it is unlikely that any individual would specifically disagree with any of the data quality concepts that constitute an effective improvement program, that agreement does not necessarily guarantee the individual’s participation. However, when a clear business case demonstrates how specific data quality issues impede stated business objectives, accompanied by a discussion of the steps that need to be taken to address the problem, the decision to introduce the improvement should be very clear.
An incremental approach to introducing change will work when there is a clear strategic plan for introducing concepts in a controlled sequence, and when the value of each concept is well defined and builds on the concepts previously introduced. The ultimate goal is for the managers involved to reflect at some point in the future and marvel at how much has been measurably improved. In an upcoming
column we will drill down into the components of the strategic data quality blueprint and how data quality concepts can be incrementally introduced to build a long-term data quality management
program.
Copyright © 2003 Knowledge Integrity, Inc.