I wear several hats. One is the “Steve the Data Modeler” hat and another is the “Steve the Publisher” hat. I manage Technics Publications (www.TechnicsPub.com), a publishing company specializing in data management and business texts. We currently have over 60 titles, and on average publish about six a year. For this issue of TDAN, I am going to talk about my latest book, Data Model Scorecard: Applying the Industry Standard on Data Model Quality.
This book provides a short data modeling primer, and then dives into how to review a data model. Learn how to apply the Data Model Scorecard to identify strengths and areas for improvement in your data models. This book is written for people who build, use, or review data models. There are three sections:
In Section I, Data Modeling and the Need for Validation, the reader will receive a short data modeling primer in Chapter 1, understand why it is important to get the data model right in Chapter 2, and learn about the Data Model Scorecard in Chapter 3.
As part of the data modeling primer in Chapter 1, a succinct explanation is provided for both “data modeling” and “data model”:
Data modeling is the process of discovering, analyzing, and scoping data requirements and then representing these data requirements in a visual format called the “data model.” A data model is a set of symbols and text used for communicating a precise representation of an information landscape. As with a model of any landscape, such as a map that models a geographic landscape, certain content is included and certain content excluded to facilitate understanding.
“Discovering” involves determining what information the business needs in its business processes and/or applications, such as learning that Customer and Account are important concepts. “Analyzing” involves clarifying requirements, such as coming up with clear definitions for Customer and Account and understanding the relationship between customers and their accounts. “Scoping” involves working with the business to determine what is most important for a particular project phase, such as whether we need both Savings and Checking Accounts, or just Checking Accounts, for Phase 1. “Representing” means displaying what the information landscape looks like using an unambiguous, precise language, such as in the following data model:
- Each Customer may own one or many Accounts.
- Each Account must be owned by one or many Customers.
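To make the difference between “may” and “must” concrete, here is a minimal Python sketch of how these two rules might be enforced in code. It is an illustration only and does not come from the book: the class names mirror the entities in the model, while the attribute names (`name`, `number`, `owners`, `accounts`) and the sample values are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Customer:
    name: str  # hypothetical attribute, not taken from the model
    # "may own one or many Accounts": an empty list is a valid state.
    accounts: list[Account] = field(default_factory=list)


@dataclass(eq=False)
class Account:
    number: str  # hypothetical attribute, not taken from the model
    owners: list[Customer]

    def __post_init__(self) -> None:
        # "must be owned by one or many Customers": reject zero owners.
        if not self.owners:
            raise ValueError(f"Account {self.number} needs at least one owner")
        # Keep both sides of the many-to-many relationship in step.
        for owner in self.owners:
            owner.accounts.append(self)


alice = Customer("Alice")
bob = Customer("Bob")
carol = Customer("Carol")  # valid even though Carol owns no Account
joint = Account("CHK-1", [alice, bob])  # one Account, two owners
```

Creating `Account("CHK-2", [])` raises a `ValueError`, whereas a Customer such as `carol` with no Accounts is accepted. That asymmetry is exactly what the words “may” and “must” communicate in the two rules.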
Data models are the main medium used to communicate data requirements from business to IT, and within IT from analysts, modelers, and architects to database designers and developers. Regardless of whether the underlying database technology is a Relational Database Management System (RDBMS) such as Oracle or Teradata, or a NoSQL database such as MongoDB or Hadoop, we still need a way to communicate data requirements. Therefore, we need data models!
Our data models need to be of high quality to support current requirements yet also gracefully accommodate future requirements. The Data Model Scorecard® is a tool you can use to improve the quality of your organization’s data models.
In Chapter 3, the Data Model Scorecard template is introduced:
| # | Category | Total score | Model score | % | Comments |
|---|----------|-------------|-------------|---|----------|
| 1 | How well does the model capture the requirements? | 15 | | | |
| 2 | How complete is the model? | 15 | | | |
| 3 | How well does the model match its scheme? | 10 | | | |
| 4 | How structurally sound is the model? | 15 | | | |
| 5 | How well does the model leverage generic structures? | 10 | | | |
| 6 | How well does the model follow naming standards? | 5 | | | |
| 7 | How well has the model been arranged for readability? | 5 | | | |
| 8 | How good are the definitions? | 10 | | | |
| 9 | How consistent is the model with the enterprise? | 5 | | | |
| 10 | How well does the metadata match the data? | 10 | | | |
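The arithmetic behind the “Model score” and “%” columns can be sketched in a few lines of Python. This is a hedged illustration, not the book's own procedure: the total scores come from the template, the short category labels are the chapter names from Section II, and the function name `score_model` and the sample model scores are made up for the example.

```python
# Total (maximum) score per category, taken from the template.
TOTAL_SCORES = {
    "Correctness": 15,
    "Completeness": 15,
    "Scheme": 10,
    "Structure": 15,
    "Abstraction": 10,
    "Standards": 5,
    "Readability": 5,
    "Definitions": 10,
    "Consistency": 5,
    "Data": 10,
}


def score_model(model_scores: dict[str, int]) -> dict[str, float]:
    """Return the percentage earned per category, plus an overall percentage."""
    percentages: dict[str, float] = {}
    for category, total in TOTAL_SCORES.items():
        earned = model_scores[category]
        if not 0 <= earned <= total:
            raise ValueError(f"{category}: score must be between 0 and {total}")
        percentages[category] = 100 * earned / total
    earned_sum = sum(model_scores[category] for category in TOTAL_SCORES)
    percentages["Overall"] = 100 * earned_sum / sum(TOTAL_SCORES.values())
    return percentages


# Hypothetical review results, invented purely for illustration.
sample_scores = {
    "Correctness": 12, "Completeness": 10, "Scheme": 8, "Structure": 9,
    "Abstraction": 7, "Standards": 4, "Readability": 3, "Definitions": 6,
    "Consistency": 3, "Data": 8,
}
result = score_model(sample_scores)
```

Because the ten total scores add up to 100, the overall percentage is simply the sum of the model scores. For the invented sample above, that is 70%.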
In Section II, Data Model Scorecard Categories, each of the ten categories of the Data Model Scorecard is discussed. There are ten chapters in this section, each chapter dedicated to a specific Scorecard category:
- Chapter 4: Correctness
- Chapter 5: Completeness
- Chapter 6: Scheme
- Chapter 7: Structure
- Chapter 8: Abstraction
- Chapter 9: Standards
- Chapter 10: Readability
- Chapter 11: Definitions
- Chapter 12: Consistency
- Chapter 13: Data
Each of these chapters ends with a summary of that category’s checks.
In Section III, Validating Data Models, the reader receives tips for preparing for the model review (Chapter 14) and for conducting the review itself (Chapter 15). All ten categories are then brought together in the final chapter (Chapter 16), where a data model from an actual project is reviewed. Here is the data model that is reviewed in Chapter 16:
What score would you give this model? I’ll give you a hint – it will not be 100% (or even 90%). Each category is applied to this model, and strengths and areas for improvement are documented. There are quite a few areas for improvement on this model!
I am looking forward to sharing more about our books with you in our next TDAN column!