Author: David Loshin
Publisher: Morgan-Kaufmann, 2001
ISBN: 0124558402
Within the realm of Business Intelligence, an amazing amount has been written about design strategies, tools, applications such as analytic frameworks, and more. One area ignored by all but a
brave few is Data Quality. It’s not difficult to understand why the topic has been avoided. Every situation is unique; it is not possible to explain what to do in every circumstance one might
encounter. This being the case, many authors write about the “easy” stuff – design approaches, best practices, pros and cons of various third party products, and so on.
Being a firm believer in the concept that “it’s all about the data,” I was thrilled when I heard about David Loshin’s Enterprise Knowledge Management (The Data Quality
Approach). After picking it up and reading it, I found this to be a true gem – a reference document that covers virtually every aspect of Data Quality, and one that does it with copious
examples and excellent readability.
I am quite serious when I describe this book as a reference. The author takes the reader through the entire lifecycle of data quality – what “quality” data is, who owns it and has
responsibility for it, how to assess your data and measure ongoing quality (the first time I’ve seen this treated in great detail), root cause analysis, metadata and, finally, an entire
chapter on “Building the Data Quality Practice.” Mr. Loshin treats data management as a chain of activities, striving to communicate how to tell the good from the bad, how to manage the
remediation of bad data and how to provide process improvements to encourage the proliferation of good and accurate data.
In a chapter particularly interesting to me (Chapter 6 – Statistical Process Control and the Improvement Cycle), the author applies statistical process control (SPC), long used in
manufacturing to produce high-quality components, to the process of “manufacturing” data, which happens in almost every modern computer application. SPC removes emotion and
opinion from measuring data quality and provides an objective way to measure the effects of a quality program.
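To give a flavor of the idea, here is a minimal sketch of my own (not code from the book) of one common SPC tool, the p-chart, applied to data quality. It assumes you count invalid records in each load batch; the function names and the sample numbers are hypothetical. A batch whose error rate falls outside three standard deviations of the overall rate is flagged as out of control.

```python
import math

def p_chart_limits(error_counts, batch_sizes):
    """Return the center line and per-batch (lower, upper) control limits."""
    # Center line: overall proportion of invalid records across all batches.
    p_bar = sum(error_counts) / sum(batch_sizes)
    limits = []
    for n in batch_sizes:
        # Standard deviation of a proportion for a batch of size n.
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        # Conventional 3-sigma limits, clamped to the valid range [0, 1].
        limits.append((max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)))
    return p_bar, limits

def out_of_control(error_counts, batch_sizes):
    """Return the indexes of batches whose error rate is outside the limits."""
    _, limits = p_chart_limits(error_counts, batch_sizes)
    flagged = []
    for i, (errors, n) in enumerate(zip(error_counts, batch_sizes)):
        lower, upper = limits[i]
        rate = errors / n
        if rate < lower or rate > upper:
            flagged.append(i)
    return flagged

# Hypothetical example: invalid records found in six loads of 1,000 rows each.
errors_per_batch = [20, 22, 18, 21, 60, 19]
rows_per_batch = [1000] * 6
flagged_batches = out_of_control(errors_per_batch, rows_per_batch)
```

In this made-up data only the fifth load (index 4) is flagged, which is the point of the technique: the decision to investigate comes from the numbers rather than from anyone's opinion of the data.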
At the same time, it is a mistake to assume that any one chapter in this book will enable a firm to resolve its data issues. Data quality is more a journey than a destination; this book is a
resource that can aid in mapping the best possible route. But to accomplish that, you’ll really need to read it all. Then you can determine where to apply the insights gained.
At almost 500 pages, this is no weekend paperback. But the time spent digesting the experience represented here is well worth it. I highly recommend that this become one of your
standard reference books, alongside Kimball and Inmon. Enterprise Knowledge Management (The Data Quality Approach) will easily become a well-used and highly trusted resource.