Recently, I was working with a colleague on a data governance project. When it came to defining the data governance policies, we found ourselves questioning what a policy actually was and how it differed from a rule, a standard, or a procedure. There is a lot of confusion about data governance terminology.
[Publisher’s note: This is the final column from Robert Lutton of Sandhill Consultants. TDAN.com would like to thank Robert for his long-time contributions to this publication.]
Governance policies, guidelines, best practices, rules, and standards tend to be complicated to unravel because they are so closely related. It is not uncommon to find data governance documentation where the governance policy is a mixture of data principles, policies, procedures, and standards. It is important to understand the distinction between various governance terms before engaging in building a data governance program. Perhaps there needs to be some governance around data governance. Let’s start with the fundamental underpinnings of data governance.
A data governance program needs to be built on a stable foundation that will drive the ability to make accurate business decisions. Data is the primary element that business decisions are based on. What data is required and how it is treated depends on certain motivational factors: what conditions in the business’s operating environment cause it to change or adjust course, what value the business places on its data, and what business objectives need to be met for the business to survive and grow. That stable foundation is constructed of principles, goals, and drivers.
A principle is a fundamental truth or proposition that serves as the foundation for a system of behavior or for a chain of reasoning. Principles are more than just high-level philosophical statements. Principles are stable and do not change often, which is why they are important. The business is always changing and adapting to various factors, but principles are like a lighthouse grounded on solid rock with a consistent flashing beacon. The weather may change but the light is constant. One of the most fundamental principles is that data and its derivations, like information and content, are considered an asset. Without this basic belief, why would we go through the effort of managing and securing it in the first place? Successful data governance requires that all participating parties document and agree on the value of data.
A data governance program must derive value for the business. Therefore, data governance goals need to be linked to business goals. A business goal is the action of implementing change in the organization to achieve a strategic outcome. These are specific considerations defining high-level factors influencing all decision-making and planning in the organization. Business goals generally fall into two categories: decreasing cost and increasing profit. An example of decreasing cost might be to reduce the time it takes to find information. An example of increasing profit might be to improve data analytics in order to better predict market trends. Business goals influence data strategy and data governance by providing direction. Direction informs the data strategy needed for the acquisition, organization, analysis, and delivery of data in support of business objectives, while communicating the scope, responsibilities, and process needed to govern that data.
Business drivers are contributors to business goals, but business drivers are based on business conditions. A business driver is a factor that currently exists or may exist in the future causing a business to take a course of action. These factors may be associated with internal or external forces applied to the business. External forces might include competition, regulations, economics, and other considerations that put pressure on the business to adapt and survive.
External drivers are not optional; there is usually a consequence for not complying. For example, privacy and security compliance regulations are a big factor driving the need to govern and manage data effectively. Internal drivers are factors that support the work done by the organization. An example of an internal driver may be the inability to produce a verified definition of a customer. Such a definition would help manage the nuances between the meaning of a wholesale customer and a retail customer, and so provide better insight into customer information. Like business goals, business drivers also influence data strategy.
Data governance policies are at the heart of data governance. Data governance policies are statements of intent designed to influence business behavior. Those statements are based on the drivers, principles, and goals stated earlier. Basically, policies convert business motivation into general assertions about how an organization is intended to behave with respect to the data assets being governed.
Policy-driven behavior is supported by controls, measures, and activity. The important thing to understand is that policy is not data management. Stated differently, policy documents what should be done, while rules, standards, and procedures document how it is done. Stewardship is the operationalization of rules, standards, and procedures within the various contexts of the policy (see figure 1).
While data policies provide guidance on what should be done, activity documents the procedures for how it is done. Procedures are a set of documented instructions that define the specific steps required to achieve a desired output. In the case of data governance, the output reflects the policy objectives, which influence the procedures for addressing data-related issues. Guidelines are procedures that are accepted or prescribed as being correct or most effective. Best practices are essentially guidelines that have provided consistent and acceptable results over time.
Following an accepted procedure is a means for obtaining consistent results. Consistent or expected results can be monitored to show a level of compliance. If a gap exists between what is expected and what is actually delivered, monitoring this gap will help determine if the procedure is working or needs modification (see figure 2). Procedure artifacts may be in the form of a numbered list, checklist, or a process flow diagram.
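The gap monitoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed method: the function name `compliance_gap`, the 95% target, and the sample outcomes are all hypothetical.

```python
def compliance_gap(outcomes, target_rate):
    """Return (actual_rate, gap) for a list of pass/fail procedure outcomes.

    outcomes: booleans, True when the procedure delivered the expected result.
    target_rate: expected share of passing outcomes (a hypothetical threshold).
    """
    if not outcomes:
        raise ValueError("no outcomes to monitor")
    actual_rate = sum(outcomes) / len(outcomes)
    gap = target_rate - actual_rate  # positive when delivery falls short
    return actual_rate, gap


# Hypothetical sample: 18 of 20 runs of a procedure produced the expected result.
outcomes = [True] * 18 + [False] * 2
actual, gap = compliance_gap(outcomes, target_rate=0.95)

# A positive gap suggests the procedure may need review or modification.
needs_review = gap > 0
print(f"actual={actual:.0%} needs_review={needs_review}")
```

Tracking the gap over time, rather than a single reading, is what would show whether a modified procedure is actually closing it.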
Data standards are more detailed artifacts that act like a measuring stick. When you want to know if something conforms, you measure it against the standard. You’ve probably heard the phrase ‘a standard of excellence’ while watching a luxury car commercial. A standard assumes a level of quality or attainment. A standard may be internal or external. An internal standard might be allowable values for determining a status level. External standards may be part of an international standard like the ISO 11179 set of standards for metadata.
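To make the measuring-stick idea concrete, here is a minimal Python sketch of checking records against an internal standard of allowable values. The status codes, the `nonconforming` helper, and the sample records are assumptions made for illustration only.

```python
# Hypothetical internal standard: allowable values for a status field.
ALLOWED_STATUS = {"ACTIVE", "SUSPENDED", "CLOSED"}


def nonconforming(records, field, allowed):
    """Return the records whose value for `field` is not in the allowed set."""
    return [r for r in records if r.get(field) not in allowed]


# Hypothetical sample data.
records = [
    {"id": 1, "status": "ACTIVE"},
    {"id": 2, "status": "active"},  # wrong case, so it fails the standard
    {"id": 3, "status": "CLOSED"},
    {"id": 4},                      # missing value, so it fails the standard
]

violations = nonconforming(records, "status", ALLOWED_STATUS)
print([r["id"] for r in violations])
```

The standard itself is just the set of allowed values; the check is the act of measuring against it.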
A rule is an officially recognized regulation by which activity is governed. Regulatory and compliance requirements are usually the main drivers that put these rules into existence. Organizations that deal with finance, healthcare and insurance are especially susceptible to external rules. Rules directly influence procedures because they set the boundaries of what can and cannot be done with data. Data Stewards, who enable the governance work, consult the rules when following the procedures to ensure compliance (see figure 3).
Understanding the differences between these governance artifacts allows us to follow the chain from motivation to action: drivers and goals influence the policies needed to modify behavior, that behavior is codified in rules and procedures, and those rules and procedures enable the stewards to do the work required to apply governance to data. Having a separation of duties allows us to adapt to change more easily. A single policy may have an association to many different downstream rules, standards, and procedures and upstream drivers, principles, or goals. When a change occurs to either the policy or its associations, it is much easier to change the specific item than to remember every policy where the rules, standards, and procedures are embedded and change each one independently.
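One way to picture this separation is a policy that refers to its rules, standards, and procedures by identifier instead of embedding their text. The Python sketch below is an assumption-laden illustration: the `Artifact` and `Policy` classes, the identifiers, and the wording of each artifact are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    id: str
    kind: str  # "rule", "standard", or "procedure"
    text: str


@dataclass
class Policy:
    id: str
    statement: str
    artifact_ids: list = field(default_factory=list)  # references, not copies


# Hypothetical registry of downstream artifacts, keyed by identifier.
registry = {
    "STD-001": Artifact("STD-001", "standard", "Customer type is WHOLESALE or RETAIL."),
    "PROC-001": Artifact("PROC-001", "procedure", "Steward reviews new customer records weekly."),
}

# Two hypothetical policies that share the same standard.
policies = [
    Policy("POL-001", "Customer data must be consistently defined.", ["STD-001", "PROC-001"]),
    Policy("POL-002", "Customer reporting must use governed definitions.", ["STD-001"]),
]


def resolve(policy, registry):
    """Look up the artifacts a policy is associated with."""
    return [registry[artifact_id] for artifact_id in policy.artifact_ids]


# The standard changes once, in one place; every policy that references it sees the change.
registry["STD-001"].text = "Customer type is WHOLESALE, RETAIL, or PARTNER."
```

Had the standard's text been pasted into each policy document, the same change would have required finding and editing every copy.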
Software developers refer to this as loose coupling. As automation becomes more integrated into systems, the need to accurately define how the pieces fit together becomes more important. Both Artificial Intelligence (AI) and Machine Learning (ML) are dependent on understanding the meaning and consistency of data to discover patterns. Automating data governance involves using rules engines, workflow engines, and data quality engines to systematize the components into a kind of data governance management system.
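As a rough illustration of what the rules-engine piece of such a system might look like, the Python sketch below evaluates a record against a list of rules and reports which ones it violates. It is a toy under stated assumptions, not a depiction of any particular product: the `Rule` type, the two rules, and the record fields are hypothetical.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    id: str
    description: str
    check: Callable[[dict], bool]  # returns True when the record complies


# Hypothetical rules; real ones would come from regulatory and internal requirements.
RULES = [
    Rule("R-1", "Record must name a data steward.",
         lambda rec: bool(rec.get("steward"))),
    Rule("R-2", "Records holding personal data must record consent.",
         lambda rec: not rec.get("personal_data") or rec.get("consent") is True),
]


def evaluate(record, rules):
    """Return the IDs of every rule the record violates."""
    return [rule.id for rule in rules if not rule.check(record)]


# Hypothetical record: consent is recorded, but no steward is named.
record = {"steward": "", "personal_data": True, "consent": True}
violated = evaluate(record, RULES)
print(violated)
```

Keeping the rules as data, separate from the `evaluate` function, mirrors the loose coupling described above: a rule can be added or changed without touching the engine.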