|
The 10-Minute Data Model Review, Part 1
Published: July 1, 2008 Data modeling is based on the precept of knowing your data and knowing how the data is interrelated with other data. Everyone knows their data – right? If so, then why do we have so many
problems building systems that do what the users want?
In this paper, Part 1 of a series, I will show you some of the techniques you can use to quickly review a data model for the most common mistakes made by data modelers. The list of mistakes is by no means exhaustive – it’s a list of mistakes that I have identified over the course of 20 years, working with and reviewing models developed by everyone from novices to experts. Throughout this paper, I use the term project to define the scope of work – a project can be the development of a stand-alone system, the development of an integrated set of systems, a departmental system, or the development of entire enterprise’s model. While your specific application of the word project will vary, the items described below will apply regardless of the scope. 1. Not understanding the purpose of logical data modeling
2. Building the model with a specific technology in mind A logical model is just that – it is a model that describes the logical (non physical) data and data relationships used by the enterprise without considering physical aspects of either the current or anticipated implementation.
3. Building a model without useful entity and attribute definitions A logical data model consists of two parts – the diagram that is hung on the wall to intimidate the uninitiated and the definitions that define what everything means. A failure to develop robust, complete, non-overlapping definitions will result in a pretty picture, but no tangible use for it. One of the most difficult aspects of data modeling is the development of good definitions; one of the first entities I look at when performing a model review is “Employee.” Typically, the definition I see is “Someone who works here.” At that point, the following conversation usually occurs:
While the definition that resulted from that conversation may not fit your specific organization, you should be able to see the type of review and discussion that must occur as part of the definition process.
4. Lack of understanding of cardinality and its implications Simply put, cardinality is identifying the numerical relationship between occurrences of item A and occurrences of item B; or, more simply, knowing how many B’s are required for each A. The cardinality embedded in a data model documents the business rules about relationships between data in a manner that is relevant to the scope. For example, some organizations have a 1:1 relationship between an employee and a desk. Each employee has one, and only one desk; each desk has one and only one employee. This is radically different from an organization with a pool of desks available for anyone in the office to use and is also radically different from an organization where a single employee may have multiple desks (usually at multiple locations). There is no correct answer; it is a business-specific model. However, the implication of specifying the incorrect cardinality is significant. Using a different example, if you wanted to model the relationship between an employee and child, most people would show that an employee could have 0, 1 or many children, but each child could only be associated with a single employee. The implication of this cardinality is that both parents of a child could not work for that company. A situation that could force an employee to quit – or to not have children if their partner worked for the company!! There are multiple methods data modelers use to ensure they have “covered all the bases”:
5. Not enforcing referential integrity Referential integrity is enforced in the data model through the use of foreign keys, ensuring that the required keys contained in the child entity are already contained in the parent entity. The implication of the dependency are quite simple. The child entity can’t be added unless the parent entity is already present, and if the parent entity is deleted, the child entity must also be deleted.
Part 2 of this series will discuss five more techniques you can use to quickly review a data model. Go to Current Issue | Go to Issue Archive Recent articles by Scott Benson
Scott Benson - Scott has been training and consulting in data and process modeling, enterprise architecture and strategic planning for 20 years, for both commercial and government clients. He has published articles
on data modeling and enterprise architecture as well as numerous other topics and has spoken at many conferences. He is a Senior Information Architect for VMD Systems Integrators, Inc., where he can be contacted by e-mail at sbenson@vmdsys.com or at (703) 288-3100.
|