Data Does Not Matter

The amount of data is growing exponentially. This creates an enormous potential for generating a lot of new information. Yet, all this data and information is totally useless if it does not help people. People are the only thing that really matters; everything else is just a means to an end. This may sound obvious, but it has severe implications for data management. In this blog I will highlight some of these implications.

As everyone knows, data is just a means to generate information. Information should provide people with answers to important questions that they have. If you really want to help people, then you should also try to really understand what they need.

Step one is understanding the question they provide you with. Understanding what people say requires a lot of contextual information, such as the processes that they want to support and the specific issues that they are dealing with.

What you really want to know is the question behind the question, because in practice, the questions that people formulate may not fully reflect what they really need. Maybe they do not have a clear view of their own context, their position in it, or what common practices are in similar situations. Also, questions themselves expose a lot of information about the assumptions that people have. If, for example, someone asks you whether you can generate a certain report about the performance of a certain process, this suggests that there are issues with this process. You should at least try to understand what sort of issues exist, because the report may not be the solution to them.

If you think you truly understand the question, you should try to understand which data is needed to answer it. Hopefully, datasets and/or applications are available that can provide the required data. An important step in selecting data and its sources is understanding which data quality requirements are essential. In the end, garbage in is garbage out: generating answers based on low-quality data leads to low-quality information. From a data quality perspective, the primary criterion is “fitness for use”. This means determining the quality requirements based on the needs of the user. This may require an even deeper understanding of the process that needs to be supported, and of the information that will be used in the process. Is it a strategic, tactical, or operational process? What is the impact of information that is incorrect? Is there a fallback for the information that you provide? This may also require the involvement of subject matter experts for these specific processes. You can ask these people what the impact is if data is not accurate, not complete, not consistent, not current, or not plausible.
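To make “fitness for use” a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the records, the field names (`amount`, `updated`), the three checks, and the threshold values are all hypothetical, and real quality rules would come from the users and subject matter experts. The point is that the same measured quality can be good enough for one process and not for another.

```python
from datetime import date

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"order_id": 1, "amount": 120.0, "updated": date(2024, 5, 1)},
    {"order_id": 2, "amount": None, "updated": date(2024, 5, 2)},
    {"order_id": 3, "amount": -15.0, "updated": date(2023, 1, 10)},
    {"order_id": 4, "amount": 80.0, "updated": date(2024, 4, 28)},
]


def share(rows, predicate):
    """Return the fraction of rows for which the predicate holds."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if predicate(row)) / len(rows)


def quality_profile(rows, today, max_age_days):
    """Measure a few quality dimensions as fractions between 0 and 1."""
    return {
        # Complete: the amount is filled in.
        "complete": share(rows, lambda r: r["amount"] is not None),
        # Current: the record was updated recently enough.
        "current": share(
            rows, lambda r: (today - r["updated"]).days <= max_age_days
        ),
        # Plausible: an amount is present and not negative (assumed rule).
        "plausible": share(
            rows, lambda r: r["amount"] is not None and r["amount"] >= 0
        ),
    }


def shortfalls(profile, requirements):
    """Return the dimensions where measured quality is below the requirement."""
    return {
        dimension: (profile[dimension], minimum)
        for dimension, minimum in requirements.items()
        if profile[dimension] < minimum
    }


profile = quality_profile(records, today=date(2024, 5, 3), max_age_days=30)

# Assumed requirements: an operational process tolerates far fewer errors
# than a strategic analysis that only looks at rough trends.
operational = {"complete": 0.99, "current": 0.95, "plausible": 0.99}
strategic = {"complete": 0.70, "current": 0.50, "plausible": 0.50}

print("Profile:", profile)
print("Shortfalls for operational use:", shortfalls(profile, operational))
print("Shortfalls for strategic use:", shortfalls(profile, strategic))
```

With these assumed numbers the dataset falls short on every dimension for the operational process, while it meets the looser strategic requirements. The thresholds, not the data alone, decide whether the quality is acceptable.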

The last important step is determining how to present the information. Should you simply generate a report, provide people with some form of dashboard, or are specific visualizations needed? This depends very much on the information that you need to convey and the profile of the target audience. In general, a picture can tell a lot more than a thousand words, and data visualizations are becoming increasingly important. The problem is that there is such a broad range of visualization types to choose from. Infographics are also gaining interest, and can even be generated automatically. Nowadays we have much more interactive ways to deliver information. We can offer people an interactive experience with data in the form of rich user interfaces and/or mobile apps. Interaction design is becoming increasingly important, especially in the context of data management. It is important to make the information easy to consume and to ensure that the proper terminology is used towards the user. Language is very important in communication; it is a direct reflection of our thinking process. Use terms that people understand: terms that are part of their everyday vocabulary.

I have provided you with a number of suggestions on how you can truly help people with data and information. Understand what they really need, ensure the data quality fits the use, and think thoroughly about how to present the information. I believe this will help you to become a more effective data professional.

About Danny Greefhorst

Danny Greefhorst, MSc., is a Principal Consultant and Director of ArchiXL in Amersfoort, The Netherlands, and acts as an architect and consultant for clients in the financial and public sector. He has extensive experience with the definition and implementation of enterprise architectures, application architectures and technical architectures. In addition, he coaches organizations in setting up and executing their architecture function. Danny is responsible for the EA portal Via Nova Architectura and is a member of the governing board of the architecture department of the Dutch Computing Association. Danny is active in the architecture community, regularly publishes on IT and architecture related topics and is co-author of the book Architecture Principles: The Cornerstones of Enterprise Architecture. He can be reached at dgreefhorst@archixl.nl.

  • Chris Pehura

    I was thrown by the title of this post. I agree with this post’s perspective based on the definition of data used. In our practice we widen the definition of data to be
    1. Data in the computers
    2. Data is information that helps us plan and analyze
    3. Data is the assumptions we operate on.

    When we consider all three as the baseline definition in data, data does matter to understand what people really want and need.
