Authors: Graeme C. Simsion, Graham C. Witt
Publisher: Morgan Kaufmann Publishers, Third Edition 2005
ISBN 0126445516
“Data modeling is not optional; no database was ever built without at least an implicit model, just as no house was ever built without a plan.” I’ve used this quote from Data Modeling
Essentials in a number of my presentations, as a powerful analogy for the value of deliberate modeling, and it resonates with my audiences.
Data Modeling Essentials (DME) has recently been updated to a third edition, a collaborative effort between Graeme Simsion and Graham Witt. These two Australians have extensive practical experience
in data modeling, and their first-hand experiences are a valuable means of illustrating data modeling concepts.
While DME is rich with information for the novice modeler, it is worthwhile reading for experienced modelers as well. It covers the gamut from basic modeling concepts to advanced topics. It starts
with some concrete benefits of data modeling, and a list of criteria for assessing data model quality. Depending on how good you are at answering impromptu questions, it will provide some new ideas
for the next time someone asks, “Why do we need a data model?” or “Why do you think this model is better than that one?” And identifying which of the quality criteria matter most for a given
modeling assignment will guide your design choices and make it easier to decide among candidate models.
DME also raises a philosophical question that you may never have considered: is data modeling analysis or design? The authors argue that data modeling is not just analysis. Some take the stance
that if the requirements are understood, there’s only one right way to model them. But the authors of DME claim there’s more than one way to model a set of requirements, and that these choices
have ramifications for clarity of communication, elegance of design, and so on.
These aren’t concepts I was exposed to when I first learned data modeling as a set of dry, mechanical activities. My early training would have been richer for learning about creativity and choice
in modeling!
After setting the stage for data modeling, the first part of DME continues with fundamental modeling concepts, such as normalization, diagramming, supertypes/subtypes and primary keys, in terms of
what they are, why they are applied, and how they impact data model design.
For example, “There is no area of data modeling in which mistakes are more frequently made, and with more impact, than the specification of primary keys.” When reading the chapter on
primary keys, I learned more than I expected, such as the factors to consider when choosing between surrogate and natural keys. Understanding the ramifications of each choice not only makes it
easier to communicate the reasoning behind a decision, it also leads to better-informed decisions!
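To make that trade-off concrete, here is a minimal sketch of my own (the table and column names are hypothetical, not examples from the book), showing the two strategies side by side as SQLite DDL driven from Python:

    # A minimal sketch (not from DME) contrasting natural and surrogate primary keys.
    # Table and column names are hypothetical.
    import sqlite3

    conn = sqlite3.connect(":memory:")

    # Natural key: the business identifier is the primary key.
    # Meaningful and join-friendly, but exposed to changes in the business identifier.
    conn.execute("""
        CREATE TABLE customer_natural (
            tax_file_number TEXT PRIMARY KEY,  -- business-supplied identifier
            name            TEXT NOT NULL
        )
    """)

    # Surrogate key: a system-generated key, with the natural identifier
    # kept as a unique attribute. Stable if the business identifier changes,
    # but the key itself carries no business meaning.
    conn.execute("""
        CREATE TABLE customer_surrogate (
            customer_id     INTEGER PRIMARY KEY,  -- system-generated surrogate
            tax_file_number TEXT NOT NULL UNIQUE, -- natural identifier as a constraint
            name            TEXT NOT NULL
        )
    """)

    conn.close()

Neither choice is free, and the chapter walks through the considerations far more thoroughly than this sketch can.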
DME Part I closes with an overview of modeling extensions and alternatives, with specific references to the Chen E-R approach, UML (Unified Modeling Language) and ORM (Object Role Modeling).
DME Part II “pulls it all together.” It addresses the organization of the data modeling task and then elaborates on the key tasks, from gathering business requirements through to
physical database design. It includes task deliverables and tips for dealing with alternative scenarios such as reverse-engineering existing databases (helpful, given how frequently we now deal
with third-party packages!).
DME offers many techniques for gathering and organizing business requirements, then transforming those requirements into a conceptual model. I appreciated the authors’ suggestions for evaluating
the conceptual data model, from the classic model walkthrough to a systematic and thorough technique of translating the model into a set of assertions framed as English sentences, with which the
business analyst can agree or disagree. This technique is the surest way to ensure that no implication of the data model has been misunderstood or overlooked!
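The assertion technique is easy to picture. Here is a toy sketch of my own (not the book’s templates; the entity and relationship names are made up) that verbalizes relationship cardinalities as English sentences for a business analyst to confirm or reject:

    # A toy illustration (not DME's own templates) of verbalizing a data model
    # as English assertions that a business analyst can agree or disagree with.
    from dataclasses import dataclass

    @dataclass
    class Relationship:
        subject: str   # e.g. "Customer"
        verb: str      # e.g. "place"
        obj: str       # e.g. "Order"
        min_card: int  # 0 = optional, 1 = mandatory
        max_card: str  # "1" or "many"

    def verbalize(rel: Relationship) -> str:
        modality = "must" if rel.min_card >= 1 else "may"
        if rel.max_card == "1":
            quantity, noun = "one and only one", rel.obj
        else:
            quantity, noun = "one or more", rel.obj + "s"
        return f"Each {rel.subject} {modality} {rel.verb} {quantity} {noun}."

    model = [
        Relationship("Customer", "place", "Order", 0, "many"),
        Relationship("Order", "be placed by", "Customer", 1, "1"),
    ]
    for rel in model:
        print(verbalize(rel))
    # Each Customer may place one or more Orders.
    # Each Order must be placed by one and only one Customer.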
From the conceptual data model, DME steps you through the transformation to a logical database model that could be implemented in a generic DBMS, and then the further transformations that produce
a physical database model, for example to improve performance. But again, there are choices, including performance-tuning options that should be explored before any compromise to the logical model
is considered.
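The “tune before you compromise” point is worth a small illustration of my own (hypothetical tables, not the book’s): a purely physical option such as an index is tried first, and denormalizing the logical model is held in reserve.

    # A minimal sketch (hypothetical tables) of exploring physical tuning options
    # before compromising the logical model.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE order_line (
            order_id   INTEGER NOT NULL,
            product_id INTEGER NOT NULL,
            quantity   INTEGER NOT NULL,
            PRIMARY KEY (order_id, product_id)
        )
    """)

    # First resort: a purely physical change. An extra index supports a frequent
    # "all order lines for a product" query and leaves the logical model untouched.
    conn.execute("CREATE INDEX ix_order_line_product ON order_line (product_id)")

    # Only if measured performance were still inadequate would a logical compromise
    # (say, carrying a redundant product_name on order_line) come into play.
    conn.close()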
Part III addresses advanced topics. These include advanced normalization, modeling time-dependent data (audit trails, snapshots and time-dependent relationships; see the sketch below for one common
shape this takes), modeling business rules (including the choices you have in where and how to represent them), data warehousing (highlighting the differences between the dimensional model and the
E-R model) and enterprise data modeling. The book ends with a list of suggested reading, broken out by chapter.
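For the time-dependent material, here is the sketch promised above: my own minimal example (hypothetical tables, not DME’s) of an effective-dated history table kept alongside the current-state table.

    # A minimal sketch (hypothetical tables, not DME's example) of one common way to
    # model a time-dependent relationship: an effective-dated history table.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE employee (
            employee_id   INTEGER PRIMARY KEY,
            department_id INTEGER NOT NULL       -- current assignment only
        )
    """)
    conn.execute("""
        CREATE TABLE employee_department_history (
            employee_id    INTEGER NOT NULL,
            department_id  INTEGER NOT NULL,
            effective_from TEXT NOT NULL,        -- ISO-8601 date
            effective_to   TEXT,                 -- NULL means still current
            PRIMARY KEY (employee_id, effective_from)
        )
    """)
    conn.close()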
This is the first book I would recommend to anyone who wants to learn more about data modeling, whether a novice or an experienced modeler. They will come away with a stronger modeling skill set
and a better appreciation of the choices they have and the implications of those choices. And with the authors’ personable tone, accessible explanations, and relevant examples drawn from practical
experience, they’ll enjoy the read along the way!