Data is Like Contaminated Water

What if the water that flowed through the pipes in your house or apartment was contaminated, and you knew that drinking it or bathing in it could make you sick? Like most home dwellers, you would invest in water filters everywhere you might be exposed to the contamination.

What if you owned a one-hundred-unit apartment building and the water was bad for all of the tenants? It is costlier to install hundreds of water filters to cover your liability. You may start thinking about putting a water filtration system in the place where the water enters the property. This would take care of the problem once and for all and would allow you to maintain consistent water quality for everyone in your building. This larger solution may cost a bit more, but it would greatly reduce your risk, improve customer satisfaction, and complete the circle of life. In the long run, it may save you money too.

Shhh … Don’t tell anybody but … Your data may be contaminated. It may be incomplete, inaccurate, untimely, un-integrate-able, and unprotected – or at least one of these things. Ask the people that define, produce, and use data in your organization (everyone?) if your data is safe – that is … formally defined, produced, and used across the organization. Then ask them how the data could be better. If your management knew the truth about the water, err … data problems, do you think they’d drink from that source? Most likely not. It makes sense that they would want to remedy the situation.

Your data is just like the water flowing through your personal pipes, except that it flows to the many people and functions in your organization. The data feeds, and flows from, your processes, your decision-making, and ultimately the people of your organization. There are water … err, data problems, stemming from poorly designed data resources, acquisition of systems and other organizations, silos, and from majestic but individualized investments made in data warehousing, business intelligence, big data, smart data, and metadata applications – all investments of your organization’s hard-earned dollars. Your organization can choose to clean up the data problem whenever it rears its ugly head, like in the individual apartment unit example, or you can put a Data Governance program in place to systematically and consistently improve the quality, usefulness, and protection of the data.

Data Governance is the execution and enforcement of authority over the management of data and data-related assets. In simpler terms, Data Governance programs are put in place to make sure that there is systematic and formal accountability for how the organization’s data and information is managed.

I suggest that organizations follow the Non-Invasive Data Governance™ approach. That is also the name of my book. The term “non-invasive” describes the way data governance is exercised – 1) people’s roles are formalized based on their relationship with the data and 2) governance is applied to process and function rather than rewriting the way things are done. The NIDG approach is the most practical and effective approach on the market today. It is easier, but not necessarily easy.

Let’s go back to your home for a moment. Putting a PUR filter on your kitchen sink does not improve the drink-ability of the water in the bathroom. The filter in the fridge does not improve water quality in the bathtub. One-off solutions do not fix the overall problem.

Back at the office, the data quality tool used to clean data as it enters the data warehouse does not solve the data quality issues in the source systems. Protecting who can see the data in one application does not prevent the wrong people from seeing the data in another application. Solving data quality problems only where they are exposed is like whacking a mole every time its head pops above the surface. It is a constant chase for the next mole, and the next data problem.
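For readers who think in code, the difference between a point fix and a fix at the source can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Customer` record, the `is_clean` rule, and the "warehouse" and "billing" consumers are all hypothetical and do not come from any particular tool or from the NIDG approach itself.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical record shape, used only for this illustration.
    customer_id: str
    email: str


def is_clean(record: Customer) -> bool:
    """Hypothetical quality rule: an id is present and the email contains '@'."""
    return bool(record.customer_id.strip()) and "@" in record.email


def govern_at_source(records):
    """Apply the rule once, where data enters, and split clean from rejected."""
    clean, rejected = [], []
    for record in records:
        (clean if is_clean(record) else rejected).append(record)
    return clean, rejected


source = [
    Customer("C1", "ann@example.com"),
    Customer("", "bob@example.com"),   # missing id
    Customer("C3", "not-an-email"),    # malformed email
]

# Point fix: only the warehouse load is filtered.
# The (hypothetical) billing application still reads the raw source.
warehouse_point_fix = [r for r in source if is_clean(r)]
billing_point_fix = source

# Fix at the source: every consumer reads the same governed data.
clean, rejected = govern_at_source(source)
warehouse_governed = clean
billing_governed = clean
```

In the point-fix case the warehouse sees one clean record while billing still sees all three, bad ones included. With the rule applied once at the source, both consumers see the same single clean record, and the two rejected records are available to route back to whoever produces them.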

Data Governance, like a new water filtration system, focuses on cleaning up the contaminated water, … err, data on a systematic scale. It is the only way to catch and destroy the moles that require a good whacking.

Moving forward with a formalized and Non-Invasive Data Governance program requires that somebody at a higher level in your organization recognizes that your data is contaminated. They may not know how contaminated the data is, or what illnesses are resulting from the contaminated data, but that does not mean that we, as data practitioners or users at the very least, can’t let them know that we have put up with crappy data for far too long.

Some Senior Leadership will get it, but some won’t. If you are in an organization that does not value the need to formally govern your data, maybe this anecdote about comparing contaminated water to dirty data will help them to get the message.

From time-to-time, republishes older popular content that is still relevant today. This oldie was published in June of 2016 as one of Bob’s It’s All in the Data columns.



About Robert S. Seiner

Robert S. (Bob) Seiner is the publisher of The Data Administration Newsletter ( – and has been since it was introduced in 1997 – providing valuable content for people that work in Information & Data Management and related fields. is known for its timely and relevant articles, columns and features from thought-leaders and practitioners. Seiner and were recognized by DAMA International for significant and demonstrable contributions to Information and Data Resource Management industries. Seiner is the President and Principal of KIK Consulting & Educational Services, a data and information management consultancy that he started in 2002, providing practical and cost-effective solutions in the disciplines of data governance, data stewardship, metadata management and data strategy. Seiner is a recognized industry thought-leader, has consulted with and educated many prominent organizations nationally and globally, and is known for his unique approach to implementing data governance. His book “Non-Invasive Data Governance: The Path of Least Resistance and Greatest Success” was published in late 2014. Seiner speaks often at the industry’s leading conferences and provides a monthly webinar series titled “Real-World Data Governance” with DATAVERSITY.

  • Joe Celko

    I love the water pipe analogy! There’s another advantage that is often forgotten: if you put the filter at the input to the entire apartment building, then any new rooms, additions, equipment, hot tub, etc. also benefit from the clean water. It’s not just an immediate solution for an existing apartment; data quality as far downstream as you can get it is a safety feature for the future.