The Data Modeling Addict – April 2006

Published in TDAN.com April 2006

An application’s flexibility and data quality depend heavily on the underlying data model. In other words, a good data model can lead to a good application, and a bad data model can lead to a
bad application. We therefore need an objective way of measuring what is good or bad about a model. After reviewing hundreds of data models, I formalized the criteria I have been using into what
I call the Data Model Scorecard.

The Scorecard contains 10 categories:

  1. How well does the model capture the requirements?
  2. How complete is the model?
  3. How structurally sound is the model?
  4. How well does the model leverage generic structures?
  5. How well does the model follow naming standards?
  6. How well has the model been arranged for readability?
  7. How good are the definitions?
  8. How well has real world context been incorporated into the model?
  9. How consistent is the model with the enterprise?
  10. How well does the metadata match the data?

This is the first in a series of articles on the Data Model Scorecard. This article summarizes the 10 categories, and each subsequent article will focus on a single category. For more on the
Scorecard, please refer to the book Data Modeling Made Simple: A Practical Guide for Business & IT Professionals. Here is a summary of each of the 10 categories:

  1. How well does the model capture the requirements? This question ensures we understand the content of what is being modeled. This can be the most difficult of all 10 categories to grade,
    because we need to understand how the business works and what the business wants from its application. If we are modeling a sales data mart, for example, we need to understand
    both how the invoicing process works in our company and what reports and queries will be needed to answer key sales questions from the business.
  2. How complete is the model? This question ensures the scope of the model matches the scope of the requirements. A model containing more than the written requirements can lead to scope creep. A
    model containing less than the written requirements could lead to application functionality gaps.
  3. How structurally sound is the model? This question ensures that the model follows good design principles. Many of the potential problems from this category are quickly and automatically flagged
    by our modeling and database tools. Examples of structural soundness violations are circular relationships, two data elements with the same exact name in the same entity, a null data element in a
    candidate key, and certain reserved words in data element and entity names.
  4. How well does the model leverage generic structures? This question ensures the correct level of abstraction is applied to the model. One of the most powerful tools a data modeler has at their
    disposal is abstraction, the ability to increase the types of information a design can accommodate using generic structures. Going from Customer Location to a more generic Location, for example,
    allows the design to more easily handle other types of locations, such as warehouses and distribution centers. Under-abstracting can lead to additional maintenance; over-abstracting can lead to
    additional confusion.
  5. How well does the model follow naming standards? This question ensures correct and consistent naming standards are applied, which are extremely helpful for knowledge transfer and integration.
    New team members who are familiar with similar naming conventions from other projects will not need to spend time learning a new set of naming standards. Efforts to bring together information from multiple
    systems will be less painful if the data elements are named consistently across projects. This category focuses on naming standard structure, abbreviations, and syntax.
  6. How well has the model been arranged for readability? This question checks that the model is visually easy to follow. This is the least important category on the Scorecard. However,
    if your entities, data elements, and relationships are difficult to read, you may not accurately address the more important categories.
  7. How good are the definitions? This question makes sure definitions are clear, complete, and correct. Clear means that a reader can understand the meaning of a term by reading the definition only
    once. Complete means that the definition is at the appropriate level of detail and includes all the necessary components, such as derivations and examples. Correct means that the definition fully
    matches what the term means and is consistent with the rest of the business.
  8. How well has real world context been incorporated into the model? This is the only question on the Scorecard that pertains to just one type of data model: the physical data model. Here we
    consider factors important to an application’s success, such as navigation requirements, response time, storage space, backup and recovery, and security.
  9. How consistent is the model with the enterprise? This question ensures that the model complements the “big picture” and is represented in a broad and consistent context. The structures that
    appear in a data model should be consistent in terminology and usage with similar structures in other data models, and with the enterprise data model if one exists. This way there will be a common
    look and feel across projects.
  10. How well does the metadata match the data? This question ensures the model and the actual data that will be stored within the resulting tables are consistent with each other. This category
    determines how well the data elements and their rules match reality. This might be very difficult to do early in a project’s life cycle, but the earlier the better, so you can avoid future
    surprises, which can be much more costly.
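
To make category 3 more concrete, here is a minimal Python sketch of how three of the structural soundness violations mentioned above (duplicate data element names, a nullable data element in a candidate key, and reserved words in names) might be flagged automatically. It is an illustration only, not part of the Scorecard itself: the dictionary layout, the function name check_entity, and the tiny reserved-word list are all assumptions made for this example, and circular relationships are not covered.

```python
from collections import Counter

# Hypothetical, deliberately tiny reserved-word list; a real list depends on
# the target database platform.
RESERVED_WORDS = {"SELECT", "TABLE", "ORDER", "GROUP", "USER"}


def check_entity(entity):
    """Return a list of structural-soundness findings for one entity.

    `entity` uses an assumed layout:
      {"name": str,
       "elements": [{"name": str, "nullable": bool}, ...],
       "candidate_keys": [[element_name, ...], ...]}
    """
    findings = []

    # Reserved word used as the entity name.
    if entity["name"].upper() in RESERVED_WORDS:
        findings.append(f"entity name '{entity['name']}' is a reserved word")

    # Two data elements with exactly the same name (compared case-insensitively).
    counts = Counter(e["name"].lower() for e in entity["elements"])
    for name, n in counts.items():
        if n > 1:
            findings.append(f"data element '{name}' appears {n} times")

    # Reserved word used as a data element name.
    for e in entity["elements"]:
        if e["name"].upper() in RESERVED_WORDS:
            findings.append(f"data element name '{e['name']}' is a reserved word")

    # Nullable data element inside a candidate key.
    nullable = {e["name"] for e in entity["elements"] if e["nullable"]}
    for key in entity["candidate_keys"]:
        for element_name in key:
            if element_name in nullable:
                findings.append(f"candidate key element '{element_name}' allows nulls")

    return findings


# Example entity with three deliberate problems.
order = {
    "name": "Order",
    "elements": [
        {"name": "Order Number", "nullable": True},
        {"name": "Order Date", "nullable": False},
        {"name": "Order Date", "nullable": False},
    ],
    "candidate_keys": [["Order Number"]],
}

for finding in check_entity(order):
    print(finding)
```

Most modeling and database tools perform checks like these for you; the sketch is only meant to show how mechanical this category is compared with, say, category 1.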

The Scorecard is relatively easy to apply and standardize. What can be challenging (but necessary) is to incorporate the Scorecard into your methodology as a final checkpoint before the model is
considered complete.
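
Part of such a checkpoint can be supported by simple data profiling. As a rough sketch of the idea behind category 10, the Python function below compares the declared rules for one data element with a sample of actual values. The function name profile_element, its parameters, and the sample values are hypothetical and chosen only for illustration; real profiling would cover many more rules, such as formats, domains, and uniqueness.

```python
def profile_element(values, nullable, max_length):
    """Compare declared metadata with sample values and return findings.

    `nullable` and `max_length` stand in for the rules recorded on the model;
    `values` is a sample of the actual data (None represents a null).
    """
    findings = []

    # Nulls found where the model says the element is mandatory.
    nulls = sum(1 for v in values if v is None)
    if nulls and not nullable:
        findings.append(f"{nulls} null value(s) in an element declared not null")

    # Values that do not fit the declared length.
    too_long = [v for v in values if v is not None and len(str(v)) > max_length]
    if too_long:
        findings.append(
            f"{len(too_long)} value(s) longer than declared length {max_length}"
        )

    return findings


# Hypothetical sample for a last-name element declared as not null, length 20.
sample = ["Smith", None, "Featherstonehaugh-Jones"]
for finding in profile_element(sample, nullable=False, max_length=20):
    print(finding)
```

Running a check like this early, even on a small extract, is one way to surface mismatches between the model and the data before they become costly.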

Portions of this article were originally published in the January 26, 2006, issue of the TDWI e-newsletter TDWI FlashPoint. Reprinted with permission. For more information about TDWI, please visit
www.tdwi.org.

About Steve Hoberman

Steve Hoberman is a world-recognized innovator and thought-leader in the field of data modeling. He has worked as a business intelligence and data management practitioner and trainer since 1990.  Steve is known for his entertaining, interactive teaching and lecture style (watch out for flying candy!) and is a popular, frequent presenter at industry conferences, both nationally and internationally. Steve is a columnist and frequent contributor to industry publications, as well as the author of Data Modeler’s Workbench and Data Modeling Made Simple. He is the founder of the Design Challenges group and inventor of the Data Model Scorecard™. Please visit his website www.stevehoberman.com to learn more about his training and consulting services, and to sign up for his Design Challenges! He can be reached at me@stevehoberman.com.
