There are tried and tested solutions to information quality challenges that enable the proactive identification and remediation of data quality issues – resulting in fewer operational incidents, fewer data-quality-related audit findings, better information for decision making, and, as a result, a healthier bottom line.
What’s the problem?
In many instances, band-aids are applied to take care of information quality issues rather than addressing the root cause(s). I realize that addressing the root causes may not always be possible, due to constraints such as legacy systems that are hard to modify, aggressive timelines for business requirements, a lack of skilled resources, and budgetary constraints, to name a few. However, denying that information quality problems exist and burying our heads in the sand won’t make them go away. We may not be able to address every problem, but we can certainly solve a significant number by tackling this head-on and using the tools and techniques at our disposal.
Some Guiding Principles
There are some guiding principles that should be considered while developing an information quality management strategy. A few are listed below:
- Apply strong information quality controls at the front-end of your information supply chain. Identifying and addressing data-related issues at the entry points into your business ecosystem is much less expensive and prevents downstream impacts
- Focus your quality control efforts on business critical data – do not attempt to boil the ocean. Identify your critical business processes and the associated business critical data and focus your efforts on them
- Capture any data quality issues in a central repository – so that you have a view into information quality patterns, sources of information quality issues and costs incurred to address them
- For those systems that support web services, implement in-line information quality checks (via information quality services) to proactively identify issues
- Trust but verify – there is a tendency to assume that your systems have adequate controls to prevent data issues. This isn’t always true – especially in volatile systems. So, profile data at the systems-of-record and trusted data sources regularly to verify data quality
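As a concrete illustration of the in-line checks mentioned above, here is a minimal sketch of a validation routine that could sit behind an information quality service at an entry point. The field names and rules (`customer_id`, `email`, `country`) are hypothetical examples, not prescriptions:

```python
import re

# Hypothetical per-field validation rules for an entry-point check.
RULES = {
    "customer_id": lambda v: bool(re.fullmatch(r"C\d{6}", str(v))),
    "email": lambda v: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", str(v))),
    "country": lambda v: v in {"US", "CA", "GB"},
}

def validate_record(record):
    """Return a list of (field, value) pairs that fail their rule."""
    issues = []
    for field, rule in RULES.items():
        value = record.get(field)
        if value is None or not rule(value):
            issues.append((field, value))
    return issues

record = {"customer_id": "C123456", "email": "jane@example.com", "country": "FR"}
print(validate_record(record))  # → [('country', 'FR')]
```

A transactional system would call such a check before persisting the record, and failed records would be routed to the central issue repository described above.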
Here is a simple process you can use to get started:
- Monitor: Continuously monitor and measure the quality of business critical data. A commercially available data quality tool can be used for this. There are many open source tools available as well
- Analyze: Analyze the information quality metrics to identify hotspots and data quality related issues
- Prioritize: Prioritize the hotspots and issues. Do not attempt to address everything. Use the 80/20 rule. Focus on the 20% that cause 80% of the pain in terms of business impact and resource utilization
- Continuously Improve Quality: Develop a process to triage such issues to determine the root causes and address them. Root causes typically fall into the data, process, and application buckets. The data and process related issues are easier to tackle than the application ones – especially if you are dealing with legacy applications.
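The analyze and prioritize steps above can be sketched as a simple 80/20 cut over the issue counts captured in the central repository. The source-system names and counts below are hypothetical placeholders:

```python
# Hypothetical issue log aggregated from the central repository:
# source system -> number of data quality issues attributed to it.
issue_counts = {
    "crm": 420, "billing": 310, "orders": 180, "hr": 50, "inventory": 40,
}

def hotspots(counts, coverage=0.8):
    """Return the smallest set of sources accounting for `coverage` of issues."""
    total = sum(counts.values())
    selected, running = [], 0
    # Walk sources from worst to best, stopping once coverage is reached.
    for source, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        selected.append(source)
        running += n
        if running / total >= coverage:
            break
    return selected

print(hotspots(issue_counts))  # → ['crm', 'billing', 'orders']
```

In this sketch, three of the five sources account for over 80% of the issues, so triage effort would be focused there first.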
There are two techniques that can be used to monitor and measure the quality of data – proactive and reactive information quality management. Proactive techniques are typically applied to “information in motion,” while reactive techniques are applied to “information at rest.” For example, you may consider invoking information quality web services proactively from transactional systems, prior to persisting the data or while the data is being moved between data stores.

Reactive techniques target legacy data stores or key systems-of-record that you wish to monitor from an information quality perspective. A data quality tool’s out-of-the-box data profiling capability, augmented with custom information quality rules, can be used to perform such scans. A combination of proactive and reactive controls should be used to identify bad data and remediate it. Most data quality tools provide cleansing, standardization, normalization, transformation, and other functions to scrub data to meet business requirements.
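To make the reactive side concrete, here is a minimal profiling sketch over an extract from a data store, computing the kinds of metrics (null rate, distinct count, rule conformance) that a profiling tool would report. The column name and the five-digit conformance rule are hypothetical:

```python
# Minimal reactive profiling sketch over a data store extract (list of dicts).
def profile_column(rows, column, conforms):
    """Profile one column: row count, null rate, distinct values, rule conformance."""
    values = [r.get(column) for r in rows]
    non_null = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "null_rate": 1 - len(non_null) / len(values),
        "distinct": len(set(non_null)),
        "conformance": sum(conforms(v) for v in non_null) / len(non_null),
    }

rows = [
    {"postal_code": "10001"},
    {"postal_code": "1000"},   # fails the five-digit rule
    {"postal_code": None},     # missing value
    {"postal_code": "94105"},
]
stats = profile_column(rows, "postal_code", lambda v: len(v) == 5 and v.isdigit())
print(stats)  # null_rate 0.25, distinct 3, conformance ~0.67
```

Scheduling such a scan regularly against systems-of-record, and feeding the metrics into the central issue repository, gives you the “trust but verify” monitoring described earlier.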
The guiding principles, methodology, and information quality management techniques provided in this article can be used by any firm that wishes to address the information quality challenge head-on. The goal is to raise awareness of this emerging field, educate firms about the tools and techniques at their disposal, and outline some simple steps they can take to bring order out of chaos.