As a scientist, my first attempt at defining something starts by excluding what it is certainly not, or what is misleading about the definition. What I’ve found is that many people misinterpret data governance. Recently I attended the Association for Institutional Research (AIR) Forum and observed (again) that data governance is commonly misunderstood.
The Institutional Research community has traditionally been less exposed to the topic of data governance than the financial services or healthcare industries. Yet it is a true ‘business’ community with members of academic affairs and institutions. These are data citizens who are dealing with the same problems we are.
Let’s look at five misconceptions of data governance that I often see, and why they are problematic.
- Data governance is a published repository of common definitions. This is an incomplete definition of data governance. Of course, a common glossary is a foundational component of many data governance initiatives. However, a repository is only trustworthy if a meaningful and transparent process exists, and responsive ownership is in place to maintain it. Trust is an essential component of a successful democratic data governance initiative.
- Data governance is a concern of – and hence managed by – IT. This definition excludes the business side of data governance. Indeed, IT plays a crucial role in identifying authoritative sources and verifying their lineage. Yet the business, as a consumer, has an indispensable role in certifying the business context of the data assets you manage.
- Data governance is just data quality (DQ) and master data management (MDM). It’s true that data quality and MDM are data management activities that must be governed. Yet DQ and MDM are about finding a mathematical truth for data in terms of quantifiable dimensions such as accuracy and completeness. Data governance goes beyond DQ and MDM by building trust in data, which only human beings can qualify. Again, trust comes into the picture as an essential component of a democratic data governance initiative.
- Data governance is siloed by business function. Your organization may be extremely decentralized and geographically distributed. That doesn’t mean you can’t establish a coordinated approach to data governance among autonomous sub-organizations. Many organizations that are decentralized and geographically distributed, such as universities and global banks, have successfully implemented a shared platform. Moreover, organizations can gain a competitive advantage by having a broader perspective on the business as a result of global data governance.
- Data governance provides no value or participation for the data-consuming community. This definition is clearly wrong. Self-service BI tools empower more and more consumers to also produce data and reports for their own applications. Data governance policies help define how confidential data can be used and how to ensure data security and quality. If trust is an essential value in the holistic governance of data, then it should be grounded in transparency and equal participation for all data citizens, which necessarily includes the consumers of the data. Together, they are your sentinels, who can identify data issues at a finer granularity than traditional monitoring can.
I’ve just described what data governance is not. So what exactly is data governance? In my mind, it boils down to three key things: people, platform, and leadership. Data governance gives all data citizens a holistic lens on their exploding data universe. It provides understanding in terms of scope, commonalities and differences, business traceability, and data lineage. These people – these data citizens – need a platform for change. The platform must function as an operating system that defines who and what are involved, as well as how they are involved. Stewardship sits on top of that platform, all aligned with metadata. Finally, you need to bridge the void between technology and data leadership by joining the power centers in the organization. The Chief Data Officer (CDO) is the leader of this data democracy and leads the charge in the data revolution.
To read more about what data governance is, please see my original post on the topic over on the Collibra blog.