In April 2022, the Department of Homeland Security announced the “Disinformation Governance Board” whose ostensible mission was to stop the spread of propaganda by foreign state actors – i.e., Russia. Critics immediately questioned the legitimacy of this board and dubbed it the “Ministry of Truth” after the malevolent bureaucratic apparatus in Orwell’s 1984.
The public’s derision may have been a reaction to the board’s vague purpose and completely opaque operating model or it may have been a response to its leader’s somewhat eccentric tweets. Unfortunately, we will never know as the board was more-or-less dismantled within weeks of its announcement. Still, there is value in thinking about what this board could have been. The idea of governance over information (and data) at the federal level may have merit. In fact, it may be inevitable. For the purposes of this article, I will ignore the policy considerations associated with federal regulation of information ecosystems and I will frame the issue in general terms as follows:
How does data governance mitigate the risk of disinformation?
When we think about mitigating risk associated with disinformation, it is good to have a framework, and when it comes to frameworks, simpler is better. So, let’s talk about a few high-level governance concepts and place them in the context of disinformation.
First off, let’s identify who plays a role in our marketplace of ideas:
Leaders: establish rules that everyone in the marketplace must follow.
Producers: collect and share information in our marketplace.
Platform: processes information in our marketplace by applying the rules our leaders establish.
Consumers: use the information to understand the current state of the world, form opinions and beliefs, etc.
Now that we understand who plays what role in our marketplace, let’s think about the risk we are trying to minimize.
Let’s start by understanding what we mean by “disinformation”. Merriam-Webster defines it as “false information deliberately and often covertly spread (as by the planting of rumors) in order to influence public opinion or obscure the truth.” I will assume we all agree that accurate public opinion and unobscured truth are good outcomes from a marketplace of ideas, so now let’s talk about how we make it easier for our marketplace to root out disinformation.
Each role in the marketplace will need to take some form of responsibility for stopping the spread of disinformation. Clearly, the leaders in this marketplace need to articulate rules that can be used to identify and stop the spread of disinformation. These rules will be applied by the platform in such a way that producers are prevented from collecting and sharing disinformation and consumers are empowered to form accurate opinions and arrive at the truth.
Now it is time for our leaders to establish rules to minimize the spread of disinformation in our marketplace. From a data management standpoint, we can think of disinformation as low-quality information. So let’s use DAMA’s model for data quality and pick a few dimensions that may be helpful in identifying disinformation and those who produce it. This will make it easier to quarantine disinformation in our marketplace and prevent it from being used to make important decisions. Here are a few relevant dimensions:
Credibility: the degree to which data values are regarded as true and believable by data consumers.
Plausibility: the degree to which data values match knowledge of the real world.
Reputation: the degree to which data are trusted or highly regarded in terms of their source or content.
Validity: the degree to which data values comply with rules (we can also think of this dimension as whether the data accurately represents reality or a verifiable source).
So now we have several attributes that can be used to identify disinformation. The next problem: how do we apply these rules? Once again, we turn to our framework and try to understand each role’s responsibility for applying and following marketplace rules.
Data Quality Metrics
When it comes to data quality, the most effective platforms will incentivize better quality at the source – i.e., in our marketplace of ideas, this would be the producer who collects and shares information. This can be done by building in quality checks as part of the platform’s architecture and by educating producers on the downstream impact and use of high-quality data.
Another powerful way of incentivizing data quality is to score it and show it so that the context of the information is clear to consumers. Our platform will need a scoring mechanism that factors in things like the type of data, its usage, and the level of validation checks applied, and then publishes that score transparently so the consumer knows what they are dealing with. This is where our data quality dimensions can be leveraged to measure and publish the credibility, plausibility, reputation, and validity of the producer and to determine whether the specific information they share is “disinformation”.
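To make the scoring idea concrete, here is a minimal sketch of how a platform might combine the four DAMA-style dimensions into a single published score and consumer-facing label. The weights, thresholds, and label names are illustrative assumptions, not part of any standard.

```python
from dataclasses import dataclass

@dataclass
class QualityScores:
    """Per-item scores on the four dimensions, each on a 0.0-1.0 scale (assumed)."""
    credibility: float   # believed true by consumers
    plausibility: float  # matches knowledge of the real world
    reputation: float    # trust in the source or content
    validity: float      # complies with rules / verifiable sources

# Hypothetical weights a platform might tune; they must sum to 1.0 here.
WEIGHTS = {"credibility": 0.3, "plausibility": 0.2, "reputation": 0.2, "validity": 0.3}

def composite_score(s: QualityScores) -> float:
    """Weighted average across all four dimensions."""
    return (WEIGHTS["credibility"] * s.credibility
            + WEIGHTS["plausibility"] * s.plausibility
            + WEIGHTS["reputation"] * s.reputation
            + WEIGHTS["validity"] * s.validity)

def quality_label(score: float) -> str:
    """Label published alongside the information so consumers see its context."""
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low (possible disinformation)"
```

A platform would publish both the numeric score and the label next to each item, so the consumer can judge at a glance what they are dealing with.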
There may be times when the consumer will need information to make a very important decision like electing an official or going to war. In these cases, the consumer will need to find high-quality information – i.e., they will need to “discover” the information. In modern data environments, this discovery is enabled through a data catalog. It will be up to our marketplace to make sure that mechanisms are in place so that producers publish the intended use of their information and platforms publish data quality scores that enable the consumer to decide what information is right for them. The catalog can also be leveraged to understand what rules apply to various classes of data.
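A catalog-based discovery flow can be sketched as filtering published entries by intended use and a minimum quality score. The entry fields and example data below are hypothetical; real data catalogs expose far richer metadata.

```python
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    title: str
    producer: str
    intended_use: str      # published by the producer
    quality_score: float   # published by the platform, 0.0-1.0 (assumed scale)

def discover(catalog, intended_use, min_score):
    """Return entries whose published use and quality fit the consumer's need."""
    return [e for e in catalog
            if e.intended_use == intended_use and e.quality_score >= min_score]

# Illustrative catalog: only the well-scored entry survives discovery.
catalog = [
    CatalogEntry("Turnout figures", "electoral-commission", "civic-decision", 0.92),
    CatalogEntry("Anonymous rumor feed", "unknown", "civic-decision", 0.30),
]
results = discover(catalog, "civic-decision", min_score=0.8)
```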
There may be times when the data produced in our marketplace is not of sufficient quality to meet the consumer’s purpose. In these circumstances, the platform should establish a pipeline for increasing the quality of that data. This is one function of “DataOps”, and the folks who run the platform are typically in the best position to put the producer and consumer together to address quality issues. If we are talking about very sensitive information, leaders may consider requiring that certain types of information have an actual “approval stamp” from the producer before it moves further in the pipeline. For example, if the information is of “medium” quality, then it should not be used by the consumer to make an important decision. By applying our disinformation rules, we will be in a much better position to prevent disinformation from entering our pipeline. This way, it will not be used when consumers are making important decisions.
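The gate described above might look like the following sketch: items below a quality threshold are quarantined, and sensitive items additionally require the producer’s “approval stamp” before moving down the pipeline. The thresholds, field names, and routing labels are assumptions made for illustration.

```python
def gate(item: dict) -> str:
    """Route an item to 'publish', 'quarantine', or 'hold-for-approval'.

    Expects keys: 'quality_score' (0.0-1.0), 'sensitive' (bool), and
    optionally 'producer_approved' (bool). All hypothetical.
    """
    if item["quality_score"] < 0.5:
        return "quarantine"          # treat as possible disinformation
    if item["sensitive"] and not item.get("producer_approved", False):
        return "hold-for-approval"   # needs an explicit producer stamp
    return "publish"
```

In a fuller DataOps loop, quarantined and held items would be routed back to the producer for remediation rather than simply dropped.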
Now that we have a few rules and we understand how those rules are applied, we can talk about whether our marketplace can be “trusted”. Trust is established when a system demonstrates the ability to enforce policy. Whether it is a computer system trusted by a user or a government trusted by a citizen, the individual feels safe to be in that environment because harmful or unauthorized activities are controlled by rules (i.e., “policies”) that are made public and consistently enforced. Stated another way, we trust a system when all reasonably foreseeable harms have been identified and controlled via the consistent enforcement of public policy. For a system to remain trusted it must be governed in a manner that ensures the ongoing effectiveness of these controls. We get these types of ideas from well-established concepts that apply to corporate governance.
There are three primary ways our marketplace of ideas will earn the public’s trust:
Transparency

Organizations must be able to clearly communicate their capabilities and reliably deliver those capabilities according to some quality standard in order to maintain users’ trust. The platform’s role is to balance the need for user understanding and acceptance against any increased security risk arising from too much transparency into the inner workings of its systems. The government’s role could be to ensure that digital policies are developed in a transparent manner, with opportunity for public input and in compliance with regulations that define the public good. Part of this will be sharing elements of architecture – which must embed safeguards by design to prevent abuse and misuse and to provide users and participants with fundamental control and transparency over their digital life. These types of concepts have already been articulated in European regulation (see GDPR Art. 12: Transparent information, communication, and modalities for the exercise of the rights of the data subject).
Accountability

Responsibility for ensuring that platform requirements are defined, understood, and implemented should be formalized in a document, typically a charter or an operating model. Just as we did at the outset of this article, understanding and assigning accountability starts with identifying the roles that manage the data identified as high value or high risk.
Explainability

Many (if not all) of the major data platforms today rely on some form of Artificial Intelligence (AI) when serving up information. These AI-powered systems are based on probability and uncertainty. The correct level of explanation is key to helping governments and users understand how the system works. Once a clear mental model of the system’s capabilities and limits is established, users can understand how and when to trust it to help accomplish their goals. Providers must consider how and when to explain what the AI does, what data it uses to make decisions, and the confidence level of a model’s output.
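One way to surface a model’s confidence alongside its output is to package the prediction, its confidence, and the signals behind it into a consumer-readable explanation. Everything here – the function, the 0.6 reporting threshold, and the example signals – is a hypothetical sketch, not a real disinformation model.

```python
def explain_output(label: str, confidence: float, top_features: list) -> str:
    """Attach confidence and contributing signals to a model's assessment."""
    if confidence < 0.6:
        caveat = "Low confidence: treat this assessment as uncertain."
    else:
        caveat = "Confidence is above the platform's reporting threshold."
    signals = ", ".join(top_features)
    return (f"Assessment: {label} ({confidence:.0%} confidence). "
            f"Signals: {signals}. {caveat}")
```

Publishing output in this form gives users the “clear mental model” described above: they see not just the label but how sure the system is and why.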
The Disinformation Governance Board was an interesting idea that never got off the ground. In all likelihood, its failure started with a lack of transparency which quickly led to a terminal inability to establish public trust. In the same respect, increasing attention to global information ecosystems may eventually lead to the same unfortunate conclusion. Any marketplace of ideas that seeks the title of “town square” should earn the public’s trust by demonstrating a clear commitment to eliminating disinformation through increased transparency, accountability and explainability. This trust will be rooted in a simple governance framework where rules are established by well-informed leaders, consistently enforced by the platform, and clearly understood by the producers and consumers of information.