An application’s flexibility and data quality depend heavily on the underlying data model. In other words, a good data model can lead to a good application, and a bad data model can lead
to a bad application. We therefore need an objective way of measuring what is good or bad about a model. After reviewing hundreds of data models, I formalized the criteria I have been using into
what I call the Data Model Scorecard.
The Scorecard contains 10 categories:
- How well does the model capture the requirements?
- How complete is the model?
- How structurally sound is the model?
- How well does the model leverage generic structures?
- How well does the model follow naming standards?
- How well has the model been arranged for readability?
- How good are the definitions?
- How well has real world context been incorporated into the model?
- How consistent is the model with the enterprise?
- How well does the metadata match the data?
This is the sixth in a series of articles on the Data Model Scorecard. The first article summarized the 10 categories, the second focused on the
correctness category, the third on the completeness category, the fourth on the structure category, and the fifth on the abstraction category. This
article focuses on the standards category, that is, how well does the model follow naming standards? For more on the Scorecard, please refer to my book, Data Modeling Made Simple: A Practical Guide for Business & IT Professionals.
How well does the model follow naming standards?
Correct and consistent naming standards are extremely helpful for knowledge transfer and integration. New team members who are familiar with similar naming conventions from other projects will not
need to spend time learning a new set of naming standards. Efforts to bring together information from multiple systems will be less painful if the data elements are named consistently across projects. This
category focuses on naming standard structure, abbreviations, and syntax.
Structure refers to the components of a name. A popular standard for data element structure is one Subject Area; zero, one, or many Modifiers; and one Class Word. A subject area is a concept that is
basic and critical to the business. A modifier qualifies the subject area, and a class word is the high-level domain for a data element. Examples of class words are Quantity, Amount, Code, and
Date. Enforcing this standard on Gross Sales, for example, would require us to identify the missing class word, giving a name such as Gross Sales Amount. Enforcing it on Name would require
us to identify the missing subject area and modifiers, giving a name such as Customer Last Name.
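The structure rule above can be sketched as a small check. This is a minimal illustration, not a definitive implementation: the class word and subject area sets are hypothetical (Name is assumed to be a class word, and Customer and Sales are assumed to be subject areas), the function name check_structure is invented for the example, and multi-word subject areas or class words are not handled.

```python
# Hypothetical lists; a real naming standard would publish its own.
CLASS_WORDS = {"Quantity", "Amount", "Code", "Date", "Name"}
SUBJECT_AREAS = {"Customer", "Sales"}


def check_structure(element_name):
    """Return a list of structure problems for a space-separated element name."""
    words = element_name.split()
    problems = []

    # The class word is expected to be the final word of the name.
    has_class_word = bool(words) and words[-1] in CLASS_WORDS
    if not has_class_word:
        problems.append("missing class word")

    # Everything before the class word is either the subject area or a modifier.
    body = words[:-1] if has_class_word else words
    subject_count = sum(1 for word in body if word in SUBJECT_AREAS)
    if subject_count == 0:
        problems.append("missing subject area")
    elif subject_count > 1:
        problems.append("more than one subject area")

    return problems


for name in ["Gross Sales", "Name", "Gross Sales Amount", "Customer Last Name"]:
    print(name, "->", check_structure(name) or "ok")
```

Under these assumptions, Gross Sales is reported as missing a class word and Name as missing a subject area, while Gross Sales Amount and Customer Last Name pass.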
An abbreviations list should be used to name each logical and physical term. Organizations should have a process in place for efficiently creating new abbreviations when a term cannot be found on the
list. The process should be carefully managed to prevent different abbreviations from being created for the same or a similar term, and to prevent the same abbreviation from being created for completely different terms.
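One way to picture the managed process described above is a registry that refuses conflicting entries. The sketch below is an assumption about how such a check might look, not a description of any particular tool; the class name AbbreviationRegistry and its methods are hypothetical, and it only catches exact conflicts, not "similar" terms, which would still need human review.

```python
class AbbreviationRegistry:
    """Keeps a one-to-one mapping between terms and abbreviations."""

    def __init__(self):
        self._abbr_by_term = {}
        self._term_by_abbr = {}

    def register(self, term, abbreviation):
        term_key = term.strip().lower()
        abbr_key = abbreviation.strip().upper()

        # Reject a second, different abbreviation for an existing term.
        existing_abbr = self._abbr_by_term.get(term_key)
        if existing_abbr is not None and existing_abbr != abbr_key:
            raise ValueError(
                f"{term_key!r} is already abbreviated as {existing_abbr!r}"
            )

        # Reject reuse of an abbreviation for a different term.
        existing_term = self._term_by_abbr.get(abbr_key)
        if existing_term is not None and existing_term != term_key:
            raise ValueError(
                f"{abbr_key!r} already stands for {existing_term!r}"
            )

        self._abbr_by_term[term_key] = abbr_key
        self._term_by_abbr[abbr_key] = term_key

    def lookup(self, term):
        """Return the approved abbreviation, or None if the term is not listed."""
        return self._abbr_by_term.get(term.strip().lower())


registry = AbbreviationRegistry()
registry.register("Customer", "CUST")
registry.register("Customer", "CUST")  # Re-registering the same pair is harmless.
```

With this registry, attempting to register Customer as CST, or CUST for an unrelated term, raises an error instead of silently adding a conflicting entry.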
Syntax covers whether terms should be plural or singular and whether words are separated by hyphens, spaces, or camelback (that is, initial upper case with no separators, such as CustomerLastName). It also
covers case: whether terms should be all upper case, initial upper case, or all lower case.
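To make the syntax choices concrete, the following sketch classifies a name by separator and case and then groups a set of names by the style each one uses. The style labels and regular expressions are assumptions chosen for illustration; hyphenated names and plural versus singular checks are left out, and a single-word name is ambiguous, so it is simply assigned the first style that matches.

```python
import re

# Hypothetical style labels mapped to patterns; checked in this order.
STYLE_PATTERNS = {
    "Initial Upper With Spaces": re.compile(r"[A-Z][a-z0-9]*( [A-Z][a-z0-9]*)*"),
    "UPPER WITH SPACES": re.compile(r"[A-Z0-9]+( [A-Z0-9]+)*"),
    "lower_with_underscores": re.compile(r"[a-z0-9]+(_[a-z0-9]+)*"),
    "UPPER_WITH_UNDERSCORES": re.compile(r"[A-Z0-9]+(_[A-Z0-9]+)*"),
    "CamelBack": re.compile(r"([A-Z][a-z0-9]+)+"),
}


def classify_style(name):
    """Return the first style whose pattern matches the whole name."""
    for style, pattern in STYLE_PATTERNS.items():
        if pattern.fullmatch(name):
            return style
    return "unrecognized"


def group_by_style(names):
    """Group names by detected style; more than one group suggests inconsistency."""
    groups = {}
    for name in names:
        groups.setdefault(classify_style(name), []).append(name)
    return groups


names = ["Customer Last Name", "CUSTOMER FIRST NAME", "customer_middle_initial_name"]
for style, members in group_by_style(names).items():
    print(style, members)
```

A model whose names all fall into one group is syntactically consistent under these assumptions; the three sample names above land in three different groups.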
Here are a few of the red flags I look for to validate this category:
- Entity names that sound more like data elements, such as Customer Type Code.
- Data element names that don’t follow the structure of one Subject Area; zero, one, or many Modifiers; and one Class Word, such as Gross Sales and Name.
- Inconsistent syntax, such as Customer Last Name, CUSTOMER FIRST NAME and customer_middle_initial_name.
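The first red flag in the list above lends itself to a simple heuristic: an entity name that ends in a class word probably describes a data element. The sketch below assumes a hypothetical class word list and a made-up function name, and it will produce false positives for entities that legitimately end in such a word, so treat a hit as a prompt for review rather than a verdict.

```python
# Hypothetical class word list; substitute the one from your own standard.
CLASS_WORDS = {"Quantity", "Amount", "Code", "Date", "Name"}


def looks_like_data_element(entity_name):
    """Return True when a space-separated entity name ends in a class word."""
    words = entity_name.split()
    return bool(words) and words[-1] in CLASS_WORDS


for entity in ["Customer", "Customer Type", "Customer Type Code"]:
    flag = "review" if looks_like_data_element(entity) else "ok"
    print(entity, "->", flag)
```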
As a proactive measure to improve the naming standards on the data model, I have found the following techniques to be very helpful:
Publish your naming standards clearly and succinctly in a format that is easily accessible. Ideally, keep your naming standards to a maximum of a few pages
and make them easy to read and user-friendly. Also publish them on websites, shared drives, and so on, to make it as easy as possible for people to reference them.
Look for opportunities to automate as much of the abbreviations process as possible. For example, some software tools will let you import an abbreviations list and
then automatically apply the abbreviations to data element and entity names. This can save considerable time and reduce human error.
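For teams without such a tool, the same idea can be approximated in a few lines. The sketch below is only an illustration under stated assumptions: the CSV layout (columns named term and abbreviation), the sample entries, the upper case with underscores output format, and both function names are hypothetical, not taken from any specific product.

```python
import csv
import io


def load_abbreviations(csv_text):
    """Read term/abbreviation pairs from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["term"].strip().lower(): row["abbreviation"].strip().upper()
        for row in reader
    }


def to_physical_name(logical_name, abbreviations):
    """Abbreviate each word; return the physical name and any unlisted words."""
    parts = []
    unlisted = []
    for word in logical_name.split():
        abbreviation = abbreviations.get(word.lower())
        if abbreviation is None:
            # Keep the word as-is and report it so it can go through
            # the process for creating new abbreviations.
            unlisted.append(word)
            parts.append(word.upper())
        else:
            parts.append(abbreviation)
    return "_".join(parts), unlisted


# Hypothetical abbreviations list.
SAMPLE_LIST = "term,abbreviation\nCustomer,CUST\nLast,LAST\nName,NAME\nAmount,AMT\n"

abbreviations = load_abbreviations(SAMPLE_LIST)
print(to_physical_name("Customer Last Name", abbreviations))
print(to_physical_name("Gross Sales Amount", abbreviations))
```

Returning the unlisted words alongside the generated name keeps gaps in the abbreviations list visible instead of letting ad hoc abbreviations creep in.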
Apply naming conventions as early as possible in the life cycle. The earlier you apply or validate naming standards, the more receptive the design and development teams will
be to making the changes. Once code has been written against the data elements, changing their names can be much more work.
Have rules about when to “grandfather” older names. When I notice data element names that don’t follow standards, it is possible that different or older naming
standards were used. In situations like this, I prefer consistent but “wrong” names to an inconsistent model in which only some percentage of the names follow current
naming standards. For example, I would rather have CUST_LAST_NAME appear consistently throughout the model than have both CUST_LAST_NAME and CUST_LST_NAM appear within the same model.
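A grandfathering rule is easier to apply when mixed spellings are easy to find. The sketch below assumes a hypothetical lookup of every known abbreviation, current or retired, back to its term; it then reports any logical name that appears under more than one physical spelling. The mapping contents and function names are illustrative only.

```python
# Hypothetical lookup of current and retired abbreviations to their terms.
KNOWN_ABBREVIATIONS = {
    "CUST": "customer",
    "LAST": "last",
    "LST": "last",
    "NAME": "name",
    "NAM": "name",
}


def logical_key(physical_name):
    """Expand an underscore-separated physical name into its logical terms."""
    return tuple(
        KNOWN_ABBREVIATIONS.get(part, part.lower())
        for part in physical_name.upper().split("_")
    )


def find_mixed_spellings(physical_names):
    """Return logical names that appear under more than one physical spelling."""
    spellings = {}
    for name in physical_names:
        spellings.setdefault(logical_key(name), set()).add(name.upper())
    return {
        " ".join(key): sorted(variants)
        for key, variants in spellings.items()
        if len(variants) > 1
    }


model_names = ["CUST_LAST_NAME", "CUST_LST_NAM", "CUST_LAST_NAME", "ORDER_DATE"]
print(find_mixed_spellings(model_names))
```

A name that is spelled only one way, whether it follows the old or the current standard, is not reported, which matches the preference for consistency over partial compliance.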