The Data Modeling Addict – October 2007

An application’s flexibility and data quality depend quite a bit on the underlying data model. In other words, a good data model can lead to a good application, and a bad data model can lead
to a bad application. Therefore, we need an objective way of measuring what is good or bad about the model. After reviewing hundreds of data models, I formalized the criteria I have been using into
what I call the Data Model Scorecard.

The Scorecard contains 10 categories:

  1. How well does the model capture the requirements?
  2. How complete is the model?
  3. How structurally sound is the model?
  4. How well does the model leverage generic structures?
  5. How well does the model follow naming standards?
  6. How well has the model been arranged for readability?
  7. How good are the definitions?
  8. How well has real world context been incorporated into the model?
  9. How consistent is the model with the enterprise?
  10. How well does the metadata match the data?

This is the seventh in a series of articles on the Data Model Scorecard. The first article on the
Scorecard summarized the 10 categories, the second article focused on the correctness category, the third on the completeness category, the fourth on the structure category, the fifth on the abstraction category,
the sixth on the standards category, and this article focuses on the readability category. In other
words, how well has the model been arranged for readability? For more on the Scorecard, please refer to my book, Data Modeling Made Simple: A Practical Guide for Business & IT.

How well has the model been arranged for readability?

This category checks that the model is visually easy to follow. It is definitely the least important category on the Scorecard. However, if your model is hard to read, you may not accurately
address the more important categories on the Scorecard. Readability needs to be considered at the model, entity, data element, and relationship levels.

At a model level, I like to see a large model broken into smaller logical pieces. I also search for the “heart” of the model. That is, upon looking at a data model, which part of the
model are your eyes naturally attracted toward? This tends to be an entity or entities with many relationships to other entities, similar to the hub on a bicycle tire with many spokes to the
outside rim of the tire. This “heart” needs to be carefully positioned so that the reader can identify it early on and use it as a starting point to walk through the rest of the model.
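
One rough way to locate a candidate “heart” before arranging the diagram is to count how many relationships each entity participates in. The sketch below assumes the relationships are available as (parent, child) pairs; the entity names and the find_heart function are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical relationships as (parent, child) pairs. The entity names are
# illustrative only and do not come from any particular model.
relationships = [
    ("Customer", "Order"),
    ("Order", "Order Line"),
    ("Product", "Order Line"),
    ("Order", "Shipment"),
    ("Order", "Payment"),
]

def find_heart(relationships):
    """Return the entity or entities that take part in the most relationships."""
    counts = Counter()
    for parent, child in relationships:
        counts[parent] += 1
        counts[child] += 1
    if not counts:
        return []
    top = max(counts.values())
    # Ties are all returned, sorted by name, since a model can have several hubs.
    return sorted(name for name, n in counts.items() if n == top)

print(find_heart(relationships))
```

A count like this only suggests where to look. Whether the busiest entity really is the best starting point for the reader is still a judgment call.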

At an entity level, I like to see child entities below parent entities. Child entities are on the many side of the relationship, and parent entities are on the one side. So if an
order contains many order lines, order line should appear below order.
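
If the diagram layout can be exported as coordinates, the parent-above-child rule can be checked mechanically. This is a minimal sketch under stated assumptions: each entity has an (x, y) position, y grows downward as on a screen, and relationships are (parent, child) pairs. The positions, names, and the children_not_below_parents function are all hypothetical.

```python
# Hypothetical diagram positions as (x, y), with y increasing down the page.
positions = {
    "Customer": (100, 300),
    "Order": (100, 50),
    "Order Line": (100, 200),
}

# Hypothetical relationships as (parent, child) pairs.
relationships = [
    ("Customer", "Order"),
    ("Order", "Order Line"),
]

def children_not_below_parents(positions, relationships):
    """Return the (parent, child) pairs where the child is not drawn below the parent."""
    violations = []
    for parent, child in relationships:
        parent_y = positions[parent][1]
        child_y = positions[child][1]
        # With y growing downward, "below" means a strictly larger y value.
        if child_y <= parent_y:
            violations.append((parent, child))
    return violations

print(children_not_below_parents(positions, relationships))
```

In this made-up layout, Order sits above Customer, so that pair is reported, while Order Line correctly appears below Order.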

At a data element level, I like to see some logic applied to the placement of data elements within an entity. On reference entities such as customer and
employee, it is more readable to list the data elements in an order that makes sense to someone starting at the top of the entity and working down sequentially: for example,
city name, then state name, then postal code. For transaction entities such as order and claim, I have found it more readable to group the data elements by
class word, such as all amount data elements grouped together.
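
Grouping by class word can be sketched in a few lines. The example assumes a naming standard in which the class word is the last word of the data element name; that convention, the element names, and the group_by_class_word function are assumptions for illustration, not something prescribed by the Scorecard.

```python
# Hypothetical data elements for an order entity, in no particular order.
elements = [
    "Order Total Amount",
    "Order Entry Date",
    "Order Tax Amount",
    "Order Ship Date",
    "Order Number",
]

def group_by_class_word(elements):
    """Reorder elements so that those sharing a class word sit together.

    Assumes the class word is the final word of each name. Groups appear in
    the order their class word is first seen, and order within a group is kept.
    """
    groups = {}
    for name in elements:
        class_word = name.split()[-1]
        groups.setdefault(class_word, []).append(name)
    return [name for group in groups.values() for name in group]

for name in group_by_class_word(elements):
    print(name)
```

The two amount elements end up next to each other, followed by the two dates, which matches the grouping described above for transaction entities.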

At a relationship level, I avoid crossing relationship lines or lines going through unrelated entities. I also look for missing or incomplete relationship labels, if labels are appropriate on the
model. The larger the model, the less useful it is to display labels as the extra verbiage can make the model harder to read.
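
Crossing relationship lines can also be detected from layout coordinates. The sketch below assumes each relationship is drawn as a single straight line between the centers of its two entities, which is a simplification of how modeling tools route lines. It only finds lines that cross each other, not lines that pass through unrelated entity boxes. All names and positions are hypothetical.

```python
from itertools import combinations

def orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0 if collinear."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)

def segments_cross(a, b, c, d):
    """True if segment a-b and segment c-d cross at a point interior to both."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)

def crossing_relationships(positions, relationships):
    """Return pairs of relationships whose straight lines cross each other."""
    crossings = []
    for (p1, c1), (p2, c2) in combinations(relationships, 2):
        if {p1, c1} & {p2, c2}:
            continue  # lines that share an entity meet there by design
        if segments_cross(positions[p1], positions[c1],
                          positions[p2], positions[c2]):
            crossings.append(((p1, c1), (p2, c2)))
    return crossings

# Hypothetical entity centers as (x, y) and relationships as (parent, child).
positions = {
    "Customer": (0, 0),
    "Order": (100, 100),
    "Warehouse": (100, 0),
    "Shipment": (0, 100),
}
relationships = [("Customer", "Order"), ("Warehouse", "Shipment")]

print(crossing_relationships(positions, relationships))
```

Here the two lines form an X, so the pair is reported. Swapping the positions of Order and Shipment would remove the crossing.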

Here are a few of the violations I have found in the readability category:

  • Child entities above parent entities.
  • No logic or poor logic in data element order (for example, data elements alphabetized within an entity, so that customer middle initial appears after customer last name).
  • Relationship lines crossing each other or through unrelated entities.
  • Difficulty finding the “heart” of the model.

As a proactive measure to improve the readability of the data model, I have found the most important technique is to actually put yourself in your audience’s shoes. That is, if you were the
person who needs to completely understand the model, how would you like to see it arranged to make it as easy to read as possible?




About Steve Hoberman

Steve Hoberman is a world-recognized innovator and thought-leader in the field of data modeling. He has worked as a business intelligence and data management practitioner and trainer since 1990.  Steve is known for his entertaining, interactive teaching and lecture style (watch out for flying candy!) and is a popular, frequent presenter at industry conferences, both nationally and internationally. Steve is a columnist and frequent contributor to industry publications, as well as the author of Data Modeler’s Workbench and Data Modeling Made Simple. He is the founder of the Design Challenges group and inventor of the Data Model Scorecard™. Please visit his website to learn more about his training and consulting services, and to sign up for his Design Challenges! He can be reached at