The Data Modeling Addict – July 2008

An application’s flexibility and data quality depend heavily on the underlying data model. In other words, a good data model can lead to a good application, and a bad data model can lead
to a bad application. We therefore need an objective way of measuring what is good or bad about a model. After reviewing hundreds of data models, I formalized the criteria I had been using into
what I call the Data Model Scorecard™.

The Scorecard contains 10 categories:

  1. How well does the model capture the requirements?
  2. How complete is the model?
  3. How structurally sound is the model?
  4. How well does the model leverage generic structures?
  5. How well does the model follow naming standards?
  6. How well has the model been arranged for readability?
  7. How good are the definitions?
  8. How well has real world context been incorporated into the model?
  9. How consistent is the model with the enterprise?
  10. How well does the metadata match the data?

This is the tenth in a series of articles on the Data Model Scorecard. The first article summarized the 10 categories, the second focused on the correctness category, the third on the
completeness category, the fourth on the structure category, the fifth on the abstraction category, the sixth on the standards category, the seventh on the readability category, the eighth
on the definitions category, and the ninth on the real world category. This article focuses on the consistency category: how consistent is the model with the enterprise? For more on the
Scorecard, please refer to my book, Data Modeling Made Simple: A Practical Guide for Business & IT Professionals.

How Consistent is the Model with the Enterprise?

Does this model complement the “big picture”? This question ensures information is represented in a broad and consistent context. The structures that appear in a data model should be
consistent in terminology and usage with the structures that appear in related data models, and with the enterprise model if one exists. This way there will be consistency across projects. If no
enterprise model exists, I look for widely accepted existing models for comparison: ERP models, if they are accessible and intelligible, or universal models, which are built for a
particular industry or function.

Here are a few of the red flags I look for to validate this category:

  • Synonyms. “Project XYZ calls it client, but the enterprise calls it customer.”

  • Homonyms. Homonyms are terms that share the same name but have different meanings. They can be very difficult to detect, because the differences are sometimes subtle. Knowing the
    states a concept goes through can help detect and correct these situations. For example, a marketing department might use the term customer to refer to prospects, whereas the accounting
    department might consider only organizations that have already made a purchase to be customers. Both Project XYZ and the enterprise call it customer, but it means two
    different things.

  • Format differences. If a data element on your model is longer than the same element on other models, the consequences are usually less disastrous than if your data element
    is shorter than on other data models, where truncated data is possible.
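
As a rough illustration, the three red flags above could be checked mechanically once both models are available as lists of data elements. The following is a minimal Python sketch under stated assumptions, not part of the Scorecard itself: the DataElement structure, the compare_to_enterprise function, the element names, lengths, and the synonym map are all hypothetical. Comparing definition text is only a crude signal, so every finding still needs review by someone who knows the business.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Hypothetical, simplified view of one data element on a model."""
    name: str
    definition: str
    length: int  # maximum length in characters


def compare_to_enterprise(project, enterprise, synonyms):
    """Return (flag, project element name) pairs for a human to review.

    project:    list of DataElement from the project model
    enterprise: dict of enterprise DataElement keyed by name
    synonyms:   dict mapping project names to enterprise names (assumed
                to be maintained by hand)
    """
    findings = []
    for elem in project:
        enterprise_name = synonyms.get(elem.name, elem.name)
        if enterprise_name != elem.name:
            # Same concept, different name.
            findings.append(("synonym", elem.name))
        match = enterprise.get(enterprise_name)
        if match is None:
            continue  # nothing on the enterprise model to compare against
        if enterprise_name == elem.name and elem.definition != match.definition:
            # Same name, different stated meaning.
            findings.append(("possible homonym", elem.name))
        if elem.length < match.length:
            # Shorter than the enterprise element: data could be truncated.
            findings.append(("truncation risk", elem.name))
        elif elem.length > match.length:
            # Longer than the enterprise element: usually less serious.
            findings.append(("longer than enterprise", elem.name))
    return findings


# Illustrative data only; none of these names come from a real model.
enterprise = {
    e.name: e
    for e in [
        DataElement("customer_name", "Name of an organization that has made a purchase", 100),
        DataElement("customer_type", "Classification of a purchasing organization", 10),
    ]
}
project = [
    DataElement("client_name", "Name of an organization that has made a purchase", 50),
    DataElement("customer_type", "Classification of a prospect", 10),
]
synonyms = {"client_name": "customer_name"}

findings = compare_to_enterprise(project, enterprise, synonyms)
for flag, name in findings:
    print(f"{name}: {flag}")
```

In this sketch a shorter project element is reported separately from a longer one, mirroring the point that truncation is the more serious of the two format differences.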

As a proactive measure to improve consistency, I have found the following techniques to be very helpful:

  • Leverage an enterprise model. If one does not exist, it can take a relatively short amount of time and effort to build one at a subject-area level. You can also leverage work by industry groups
    and consortiums, such as the HL7 (Health Level 7) model for healthcare or the SID (Shared Information/Data Model) for communications.

  • Reuse as much as possible from similar models. This includes both subject matter and common sets of data elements. For subject matter, if you are designing a customer area, perhaps that specific
    area of customer has already been modeled and can be copied into your model. For common sets of data elements, if there is a standard way of representing phone numbers or addresses, for example,
    you can copy these directly into your model.

  • Make friends. I have found those with in-depth knowledge of a specific area and those with broad business understanding to be invaluable to ensure consistency within my model.



About Steve Hoberman

Steve Hoberman is a world-recognized innovator and thought-leader in the field of data modeling. He has worked as a business intelligence and data management practitioner and trainer since 1990.  Steve is known for his entertaining, interactive teaching and lecture style (watch out for flying candy!) and is a popular, frequent presenter at industry conferences, both nationally and internationally. Steve is a columnist and frequent contributor to industry publications, as well as the author of Data Modeler’s Workbench and Data Modeling Made Simple. He is the founder of the Design Challenges group and inventor of the Data Model Scorecard™. Please visit his website to learn more about his training and consulting services, and to sign up for his Design Challenges! He can be reached at