
This column expands on a Systems Thinking approach to Data Governance, with a focus on process control. The vendors of myriad governance tools concentrate on metadata, dictionaries, and quality metrics, and their marketing is a sea of buzzwords, bells, and whistles. Yet where is the evidence of actual business value, defined as return on investment (ROI)? By what percentage did profits increase, or overall expenses drop? Was there a quantifiable reduction in enterprise data risk metrics? The technical governance landscape is essential, but it is only a means to an end, not the desired end state.
Systems Thinking teaches that Data Governance is first and foremost a process control system. Every business runs on processes and invests in process optimization, and data is woven into all of them. Years ago, analysts used a CRUD (create, read, update, delete) matrix to show the relationship between processes and data explicitly. Companies have built extensive data processing capabilities to move this data between systems and reporting engines, all of which support key business processes.
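To make the idea concrete, here is a minimal sketch of a CRUD matrix in Python. The process names, entity names, and operations are hypothetical, but the structure shows how process-to-data relationships can be made explicit and queryable:

```python
# A minimal CRUD matrix sketch: hypothetical business processes mapped to
# the operations they perform on hypothetical data entities.
CRUD_MATRIX = {
    "Order Entry":     {"Customer": "CR", "Order": "C",  "Product": "R"},
    "Billing":         {"Customer": "R",  "Order": "RU", "Invoice": "CRU"},
    "Account Closure": {"Customer": "UD", "Order": "R"},
}

def processes_touching(entity: str) -> list[str]:
    """Return every process that creates, reads, updates, or deletes an entity."""
    return [proc for proc, entities in CRUD_MATRIX.items() if entity in entities]

print(processes_touching("Customer"))  # ['Order Entry', 'Billing', 'Account Closure']
```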
Businesses already have operational process controls. Departments such as Compliance, Accounting, Audit, Records Management, and Risk Management are tasked with controlling, measuring, and reporting on the health or profitability of business processes. Moving beyond governance buzzwords requires acknowledging that data management and reporting form a complicated process that runs in tandem with business operations. Effective Data Governance must recognize this combined business and technology process infrastructure and apply the necessary controls to maximize ROI.
We can follow several models for insight into how to implement a data process control framework. Years back, I spent time in several petroleum refineries with British Petroleum and Standard Oil. My mentor was a mechanical engineer who taught me the fundamentals of process management. The key was continual feedback loops at every stage of the refining process. These loops monitored key metrics (temperature, pressure, viscosity, etc.), and advances in computing allowed alarms to fire automatically whenever a process drifted out of balance. This type of process control is everywhere around us, and like data, it runs in the background until there is a problem. The dashboard in your car is an excellent example.
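The same control pattern translates directly into software. Below is a minimal sketch of a threshold-based feedback check, assuming hypothetical metric names and operating ranges:

```python
# A minimal process-control feedback loop: sample key metrics, compare each
# reading against its acceptable operating range, and raise an alarm when a
# process drifts out of balance. Metric names and ranges are hypothetical.
OPERATING_RANGES = {
    "temperature_c": (320.0, 360.0),
    "pressure_kpa":  (180.0, 220.0),
    "viscosity_cst": (2.0, 6.0),
}

def check_readings(readings: dict[str, float]) -> list[str]:
    """Return an alarm message for every metric outside its operating range."""
    alarms = []
    for metric, value in readings.items():
        low, high = OPERATING_RANGES[metric]
        if not low <= value <= high:
            alarms.append(f"ALARM: {metric}={value} outside [{low}, {high}]")
    return alarms

print(check_readings({"temperature_c": 371.2, "pressure_kpa": 195.0, "viscosity_cst": 4.1}))
```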
The flow of data from source systems, accounting systems, sales systems, billing systems, and reporting systems is nothing but an integrated data flow process, just like the oil in a refinery that starts as crude and ends up as unleaded gasoline. In the case of business, raw data enters systems via business processes and is transformed into information via calculations, reporting, and predictive analysis, or by being placed into the context of a business event. Various engineering disciplines know this, and we can draw on decades of lessons learned in other industries to implement the optimal governance control framework.
Like the refinery engineer, organizations need continual data quality feedback loops to ensure data health and inform business decisions. Data quality tools can assess overall quality, but they often lack the business-driven governance awareness needed for comprehensive insight. Data quality is a complex metadata problem that demands a metadata-driven quality framework: while tools can automate the checks, an integrated governance framework must also know the relationships between source systems, data domains, ownership, and compliance. Such a framework enables predictive analytics and informed decision-making.
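As an illustration of what "metadata-driven" can mean in practice, here is a sketch in which each quality rule carries its governance context, so a failed check reports the source system, domain, owner, and compliance tag rather than a bare technical error. All names and rules are hypothetical:

```python
# A sketch of a metadata-driven quality check: each rule carries governance
# context, so every failure is reported in business terms.
from dataclasses import dataclass
from typing import Callable

@dataclass
class QualityRule:
    name: str
    source_system: str
    domain: str
    owner: str
    compliance_tag: str
    check: Callable[[dict], bool]  # returns True when the record passes

rules = [
    QualityRule("customer_id_present", "CRM", "Customer", "Sales Ops", "SOX",
                lambda rec: bool(rec.get("customer_id"))),
    QualityRule("invoice_amount_positive", "Billing", "Finance", "Accounting", "SOX",
                lambda rec: rec.get("amount", 0) > 0),
]

def run_checks(record: dict) -> list[dict]:
    """Evaluate every rule and return governance-aware failure reports."""
    return [
        {"rule": r.name, "source_system": r.source_system, "domain": r.domain,
         "owner": r.owner, "compliance": r.compliance_tag}
        for r in rules if not r.check(record)
    ]

print(run_checks({"customer_id": "", "amount": -40.0}))
```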
The most effective data quality process controls pay for themselves many times over. In one case, a company used automated data checking with a data quality tool to improve operating efficiency and reduce data risk. Previously, multiple departments reconciled data but failed to communicate the results to all stakeholders, leading to quality issues. Under the new process, verified results were distributed daily, yielding a 20 percent gain in reporting efficiency. That got senior management's attention.
Over time, the company needed to upgrade and replace aging systems. The data quality team was asked at an executive meeting, "Which source systems generate the most errors?" Unfortunately, the data control technology lacked statistics for this query. With no way to leverage past results to predict future failures, the team had to review past reports manually. This case illustrates the limitations and risks of standalone quality tools. An integrated governance approach looks at problems at a system level; the governance goal is total awareness across the organization to enable informed business decisions.
This system replacement team needed to answer the following questions based on awareness reports against the company's technical and governance metadata:
- What compliance or regulatory risks are associated with data?
- What source systems generate the most data quality errors?
- Which organizational groups generate the most errors?
- Where are all data copies across the data landscape?
- Which systems contain confidential or PII data?
- Are the data consumers of every domain known?
- Do we have a valid business glossary?
Imagine the value of querying an underlying metadata repository to answer this list of questions. That is a far better solution than searching SharePoint for legacy project and compliance spreadsheets where the information had previously been logged (and lost). Companies can implement governance as a process control system with a technology solution that integrates metadata, putting critical answers about source, use, risk, ownership, and data quality at their fingertips. Now imagine walking into a budgeting or system planning discussion with that level of knowledge. This awareness can also advance company operational and risk management initiatives.
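As a sketch of what such queries might look like, the following uses an in-memory SQLite database as a stand-in for a metadata repository. The schema and data are hypothetical, but two of the questions above become one-line queries:

```python
# Querying an integrated metadata repository for governance answers.
# A real repository would be far richer; the query pattern is the point.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE systems (name TEXT, contains_pii INTEGER);
CREATE TABLE quality_errors (source_system TEXT, org_group TEXT, error_count INTEGER);
INSERT INTO systems VALUES ('CRM', 1), ('Billing', 1), ('Inventory', 0);
INSERT INTO quality_errors VALUES
  ('CRM', 'Sales', 120), ('Billing', 'Finance', 310), ('Inventory', 'Ops', 45);
""")

# Which source systems generate the most data quality errors?
print(con.execute("""
  SELECT source_system, SUM(error_count) AS errors
  FROM quality_errors GROUP BY source_system ORDER BY errors DESC
""").fetchall())

# Which systems contain confidential or PII data?
print(con.execute("SELECT name FROM systems WHERE contains_pii = 1").fetchall())
```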
Systems Thinking emphasizes ongoing feedback loops, while Lean Thinking optimizes inventory management. Lean Governance combines these approaches to gain control over data and protect profit margins, recognizing that governance is an awareness problem solved with the right mindset, methodology, and technology.
There is urgency in focusing on process-driven governance and controls. We are bombarded with blogs about how AI demands quality data, and shifting Data Warehousing technologies are adding a new level of risk. The classic ETL approach is being replaced with ELT, where raw data is pumped into large storage databases and the burden of query logic shifts from IT to the business. Traditional primary and foreign key constraints are often not enforced, raising the risk of data errors. Quality feedback loops solve the problem.
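As a sketch of such a feedback loop, the checks below validate the key constraints the warehouse no longer enforces, using hypothetical tables in an in-memory SQLite database:

```python
# When an ELT target does not enforce key constraints, a post-load feedback
# loop can check them instead. Tables and data are hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (customer_id INTEGER);      -- no PK enforced, as in many ELT targets
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
INSERT INTO customers VALUES (1), (2), (2);        -- duplicate key slipped in
INSERT INTO orders VALUES (10, 1), (11, 3);        -- order 11 references a missing customer
""")

# Feedback check 1: duplicate primary-key values.
dupes = con.execute("""
  SELECT customer_id, COUNT(*) FROM customers
  GROUP BY customer_id HAVING COUNT(*) > 1
""").fetchall()

# Feedback check 2: orphaned foreign keys.
orphans = con.execute("""
  SELECT o.order_id FROM orders o
  LEFT JOIN customers c ON o.customer_id = c.customer_id
  WHERE c.customer_id IS NULL
""").fetchall()

print("duplicate keys:", dupes)     # [(2, 2)]
print("orphaned orders:", orphans)  # [(11,)]
```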
Our next column will explore how companies leverage governance awareness to navigate their desired corporate future.