Many Data Governance or Data Quality programs focus on “critical data elements,” but what are they and what are some key features to document for them?
A critical data element is any data element in your organization that has a high impact on your organization’s ability to execute its business strategy. An example is Customer Email Address, assuming your business primarily corresponds with customers via email. These are key fields or data points that, if negatively impacted by quality, process, or overall management, would severely impair your ability to conduct business. Now that we’ve got that out of the way, here are four things to document when cataloging your critical data elements.
1. Foundational Information
When looking at our element, we want to document some basic information. Here’s where we capture data type, format, and whether any specific validation tables exist for it (think “Country Codes” as an example). Here, we may also consider documenting some basic data quality issues that we’ve noticed with this particular data element: things like “completeness” (is it filled in?), “accuracy” (does it correctly reflect the real world?), and “validity” (is that date field a real day?).
While this may seem trivial to document, it provides an easy lookup of information about our element. As systems of record or reporting environments change over time, we’ll still have our original concerns and the original specifications documented. When presenting or discussing our critical data element with others in the organization, this foundation will give everyone a sense of what it is and some broad strokes of how it can go wrong.
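As a rough illustration of the quality dimensions mentioned above, here is a minimal Python sketch of completeness and validity checks. The function names, the sample country codes, and the date format are assumptions made for the example, not part of any particular tool. Accuracy is left out on purpose, since it needs a trusted real-world reference to compare against.

```python
from datetime import datetime

# Hypothetical validation table; a real one would come from your reference data.
COUNTRY_CODES = {"CA", "US", "GB"}


def completeness(values):
    """Return the share of values that are filled in (not None and not blank)."""
    if not values:
        return 0.0
    filled = [v for v in values if v is not None and str(v).strip() != ""]
    return len(filled) / len(values)


def is_valid_date(text, fmt="%Y-%m-%d"):
    """Return True if text is a real calendar day in the assumed format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except (ValueError, TypeError):
        # ValueError: not a real date or wrong format; TypeError: not a string.
        return False


def is_valid_country(code):
    """Return True if the code appears in the validation table."""
    return code in COUNTRY_CODES
```

For instance, `completeness(["a@example.com", "", None, "b@example.com"])` gives 0.5, and `is_valid_date("2023-02-30")` is false because February 30 is not a real day.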
2. Key Roles
We need to consider who’s who in the zoo when it comes to our critical data element. Documenting some of the key stakeholders at play for our element is important, so we know where in the organization it fits and whose priorities are impacted by working with this data element. When documenting roles, it’s usually best to focus on the position rather than the person, but whatever makes sense for your organization’s culture is the most appropriate. Roles (with a brief description) include:
- Data Owner: The role accountable for the enterprise system(s) or business unit that manages the data in question.
- Data Steward: A subject matter expert, often the team lead, who has a keen understanding or ownership of the processes that manage the data.
- Data Custodian: The folks responsible for following the processes, procedures, and guidelines relating to the data in question. Often, the folks responsible for entering the data and potentially fixing the data.
3. Key Lineage
Lineage is a key thing to capture as it relates to our element. Lineage refers to where our data comes from and where it goes. Many software tools will capture lineage in an automated fashion, but here we would like to focus on a few key lineage components. Namely:
- Source System(s): The enterprise systems, spreadsheets, master data solutions, and purchased data sources for our critical data element.
- Reporting Architecture: Any architecture that this element gets moved to from the source system, be that an operational data store, data lake, data warehouse, or other location outside the source system.
- Key Reports: Reports that use the data element in question, either in its original form or as part of the calculation of a metric. This is often where decision-makers interact with and experience the data. This would include KPIs, quarterly reports, or any sensitive report.
The broad strokes of lineage are critical to capture here in plain language so it will be easy to communicate to the business the impact of the data element throughout the organization. This is key to answering the questions, “Why does it take so long to fix?” and “Why is this so complicated, it’s just addresses!?”
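To make the impact question concrete, the three lineage components above can be sketched as a small graph that we walk downstream. This is a minimal sketch in Python; the system and report names are made up for illustration, and `downstream` is a hypothetical helper, not a feature of any lineage tool.

```python
# Hypothetical lineage for "Customer Email Address":
# each key feeds data to the locations listed as its value.
LINEAGE = {
    "CRM": ["Data Warehouse"],
    "Purchased Contact List": ["Data Warehouse"],
    "Data Warehouse": ["Quarterly Churn Report", "Campaign KPI Dashboard"],
}


def downstream(node, lineage):
    """Return everything that directly or indirectly receives data from node."""
    seen = []
    queue = list(lineage.get(node, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # already visited through another path
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen


print(downstream("CRM", LINEAGE))
```

Listing what sits downstream of a source system is one plain-language way to show the business why a fix in one place touches several reports.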
4. Other Supporting Links and Documentation
It’s critical to have a section to link to existing artifacts or controls within your organization. You want any sort of documentation to provide easy access to details without necessarily having them directly in the document (let’s keep this document to one page!). If you’re a fan of Danette McGilvray’s “Executing Data Quality Projects: Ten Steps to Quality Data and Trusted Information,” adding in documentation links for the following works very well:
- Root Cause Analysis
- Prevention Plan(s)
- Resolution Plan(s)
- Controls
- Communication Plan(s)
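If you keep the catalog in a structured form rather than a document, the four sections described in this article could be held in a single record. The following is only a sketch under that assumption; the class name, field names, and sample values are hypothetical, and your own template may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalDataElement:
    """One catalog entry; each dict holds one of the four documented sections."""
    name: str
    foundational: dict = field(default_factory=dict)  # data type, format, known issues
    roles: dict = field(default_factory=dict)         # owner, steward, custodian
    lineage: dict = field(default_factory=dict)       # sources, architecture, reports
    links: dict = field(default_factory=dict)         # root cause analysis, controls, ...

    def missing_sections(self):
        """Return the names of sections that have nothing documented yet."""
        sections = {
            "foundational": self.foundational,
            "roles": self.roles,
            "lineage": self.lineage,
            "links": self.links,
        }
        return [name for name, content in sections.items() if not content]


entry = CriticalDataElement(
    name="Customer Email Address",
    roles={"Data Owner": "VP, Customer Experience"},  # a position, not a person
)
print(entry.missing_sections())
```

A check like `missing_sections` gives a quick view of which entries still need attention before they are shared with the wider organization.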
Conclusion
Whenever we look at Data Governance or Data Quality programs, identifying critical data elements is key to the success of these endeavors. Considering all data is just not feasible. Focusing the organization on data that is critical to the success of the business is what will get your program to a sustainable state with plenty of engagement and interaction, building those champions for you. Also, with some key items documented, you’ll have a good understanding of what is important for you to consider when purchasing any sort of vendor solution.
Now, without further ado, go forth and document!