From Bob Seiner’s first book, Non-Invasive Data Governance, we learned how to get the benefits of data governance without making major changes to our job roles or functions. We avoid the “command and control” approaches and still have people responsible for the organization’s data without reorgs or undue employee stress. That’s probably why Non-Invasive Data Governance has remained a bestseller since its release in 2014. If Data Governance were a New York Times book review category, I can safely say it would have been on the New York Times Best Seller list for the last decade. It is still selling very well today.
Non-Invasive Data Governance Strikes Again contains Bob’s experiences and perspectives gained since the first book’s release. Its 50 essays are either experience-based or perspective-based, and they share lessons from Bob’s years of working with many organizations on implementing data governance. In Bob’s own words on the difference between the two books: “The first book focused on selling data governance to your organization so that the higher-ups give the ‘green light’ to proceed with the program’s definition, delivery, and administration. The second book is about putting the necessary components of data governance into place to deliver successful and sustainable governance in our organization.”
There are very important messages throughout the book, and without giving them all away, I’ll share a few of my favorites, including “everybody is a data steward, and you must get over that fact,” “data governance is an evolution, not a revolution,” and “the data will not govern itself.” Bob reinforces messages such as these through many actual examples.
I agree with Tony Shaw, CEO and Founder of Dataversity, when he says, in endorsing the book, that “…when Bob shares ‘experience and perspective’ in his new book, you are tapping into literally thousands of hours of hard work and creative thinking which have been applied in the real world.”
I’d like to share a short excerpt from one of the perspective essays to give you a sample of the book’s practical content and Bob’s easy-to-read and entertaining writing style. This excerpt, from “Data Governance Challenges Associated with Large Language Models (LLMs),” is used with permission from Technics Publications:
Large language models are artificial intelligence systems trained on massive amounts of data and capable of generating human-like interaction through text. LLMs use machine learning (ML) algorithms to analyze patterns in data and learn how to generate text similar to what a human might write or say. Using LLMs can create significant data governance challenges, particularly regarding data quality, data privacy, and ethical considerations.
Some of the better-known LLMs include GPT-3 (Generative Pre-trained Transformer 3), ChatGPT (same acronym applies), and BERT (Bidirectional Encoder Representations from Transformers), which have been used, with varying levels of skepticism, for a variety of applications, such as language translation, content generation, and chatbots. In this essay, I will address the relationship between data governance and LLMs and discuss some of the key considerations for organizations seeking to implement LLM technologies.
In the past, when I have written about my experiences implementing data governance programs, I typically focus on how the programs govern organizational data assets to ensure they are accurate, consistent, secure, and compliant with legal and regulatory requirements. I have written about data governance policies, procedures, best practices, tools, and technologies used to support these activities. Data governance is critical for ensuring that data is available when needed, is of sufficient quality to support decision-making, and is protected from unauthorized access or misuse. These same topics are relevant when LLMs enter the conversation.
LLMs are formidable tools that use complicated algorithms to identify patterns and relationships in data and language. LLMs have shown remarkable success in generating human-like text, leading to their adoption in a wide range of industries. Applying data governance to the use of LLMs presents organizations with new challenges.
These challenges include:
- Data Stewardship Challenges
- Data Documentation Challenges
- Data Risk, Privacy, and Security Challenges
- Data Quality Challenges
- Third-Party and Vendor Challenges
- Operational Efficiency Challenges
Since many organizations have not yet addressed the data governance challenges associated with LLMs, and to demonstrate the effectiveness of the technology, an LLM provided input into the details of the following challenges. Limited references to non-invasive data governance have been inserted, highlighting the point that the data challenges presented by LLMs are consistent across all approaches to data governance.
Bob goes on to explain each of these challenges. Get yourself a copy of the book to learn more!