This paper forms the second part of a big-picture top-down data modeling trilogy. The first paper on why to take this approach can be found in two parts here (part 1 and part 2). This paper articulates “how” to develop robust data models in a timely manner and deliver sufficient quality at a speed the business can value. Instead of the business threatening data modelers to hurry up and deliver, we can engage with the business and have fun giving them what they need.
But any model is useless shelfware if it’s not applied. One of the most common applications of these models is the design of a Data Warehouse, and this is the topic for the third member of the trilogy, specifically using Data Vault as the platform for discussion.
However, it is important to note that top-down models can deliver tangible business value across a variety of other scenarios such as formulation of enterprise strategies, delivery of Master Data Management (MDM) solutions, design of a services architecture, and more. Those interested in just Data Vault will hopefully enjoy the entire trilogy; others are encouraged to read and apply the first two parts, and then maybe add their own paper on how they applied top-down models to their particular area of interest.
We Need Quality at Speed
The business demands that we deliver data models quickly, but they are not to be “quick-&-dirty” models. It expects that these models will not only be sufficient for now but also that they won’t become legacy technical debt over time when we work them a bit harder.
Some authors suggest the use of data model patterns might help on both fronts – quality and speed. The patterns have been consciously designed to be extensible and robust across changing demands, which is pretty important as data models are living things, needing to change in order to keep up with (or even get in front of) the business. That’s why Len Silverston, one of the champions of data model patterns, calls them “universal”. Not only do they have a really good chance of applying to other industries, they also have a high probability of fitting with your changing demands. And they’ve already been produced and are ready for your consumption – there’s the speed bit.
In his book, Data Model Patterns: Conventions of Thought, David Hay expresses their potential this way:
“… simpler and more generic models, … stand the test of time better, are cheaper to implement and maintain, and often cater to changes in the business not known about initially”
Another quote carrying a similar message comes from Mark Kortink in my book, The Nimble Elephant [highlighting mine]:
“… John is the best and quickest developer of powerful and practical data models that I know, and it is because of his extensive use and deep understanding of data modeling patterns. … I believe the top-down pattern-based approach John proposes is the only way to get agile with data modeling that produces robust sustainable solutions.”
If Patterns Are So Great, Where Do We Start?
Someone new to data modeling could start by doing an introduction-to-data-modeling course. But as I noted in the companion paper on “why” to do top-down modeling, too many training courses teach bottom-up detailed modeling by implication. Old-school, bottom-up data modeler training is a start, but it won’t get us all the way to where we want to be.
Maybe the individual charged with getting an enterprise data model out the door in record time already knows the fundamentals of data modeling. They could buy the data model patterns books of David Hay, Len Silverston, and others. I love their books and highly recommend them, but they’re reference books, not novels. Unless you’re a data model enthusiast like me, you may find reading them cover to cover about as much fun as reading a dictionary. And it will take ages to do. You might be better off first getting a feel for the patterns, and then going to these encyclopedic works for targeted research. If that sounds good to you, please read on.
One Approach to Pattern-Based Delivery
Life is rich and varied. There are plenty of methodologies that work in some cases but struggle in other situations. Having said that, below is a rough project plan that puts development of a pattern-based, big-picture enterprise data model in context. I have used this approach, with a few variations, a number of times now. As a guideline, it may be useful to you, but please note that the timeframes assume a level of familiarity on the part of the lead modeler / facilitator. The accompanying “Patterns-D-Lite” will hopefully give you enough of an introduction to make a start, but even then I suggest you cut yourself some slack and allow more time.
Please note that the following descriptions provide an overview of one approach. If you choose to apply or adapt this approach for yourself, the supplementary material may help. I encourage you to progressively familiarize yourself with the material while you read this paper.
Week 1: Develop the Enterprise Data Model Framework
Days 1 to 3: Familiarization
If you’re a freelance, independent consultant like me, you might rock up on a new assignment and do your homework to try to get an understanding of the client, like I do. But when you turn up on day one, that’s when you get to hear what’s really happening. Even if you’re an employee acting as an internal consultant, I’d recommend you do what I do – arrange meetings with key stakeholders and listen to their views. Spend a few days acting like a sponge, being willing to set aside your preconceived ideas.
That’s the start of Week #1. Towards the end of Week #1, the fun begins.
Day 4: Introducing ‘Patterns’
Here’s the fundamental message: we want to bridge the divide between the business folk and the data professionals. So we pick a team of the visionaries from across the two areas, and work to get them on the same page.
Some people claim that non-technical people simply can’t understand a data model. In my book, The Nimble Elephant, I share how the published data model patterns of David Hay, Len Silverston, and others can be used as a bridge between business and technology. These patterns, when seen from a higher level of generalization, look at things such as events, tasks, agreements between parties, assets, and so on. If these patterns are introduced to the business folk in a non-technical manner, they “get it.”
To kick things off, I run a standard course that uses wildfire emergency response as a real-life case study. They learn enough about the patterns to gain a high-level understanding, and go through exercises to link these building blocks together for the emergency response scenario. By the way, many core elements of this course are now in your hands (in this paper plus the supplementary material)!
Day 5: Applying the ‘Patterns’
The preceding day’s work had exercises based on wildfires. Now the focus swings to their organization. I provide a user-friendly palette of matching icons for each of the foundational patterns. With a bit of facilitation, the business people can select from the palette of icons that represent their concepts, and then draw labelled lines to represent their business relationships. A simple example, based on an organization responsible for real estate title registration (buying & selling land, registering mortgages, etc.) ends up looking something like the following. An enterprise data model framework is already starting to take shape. It has their business concepts, and their interrelationships.
For people in the real estate registration business, this single-page pictorial hopefully makes sense:
- Parties and their roles include the people or organizations selling land – the vendors. Then you’re likely to have some of the following, too – the purchasers, real estate agents, solicitors, banks, guarantors and more. They are succinctly represented by the Party & Role icon.
- The Agreement icon likewise represents a rich collection of things like bank loans, loan guarantees, the agreements between people and their solicitors, and the contract with the selling agent. And of course, the contract between the vendors and the purchasers.
- In the supplementary material, I go into the difference between Agreements and Documents. For now, let’s keep it simple and say that you can have an informal Agreement without supporting Documentary evidence, but for transfer of land scenarios, you’d better have managed copies (paper and electronic) of the contracts.
- The Resources / Assets, in this scenario, usually refer to the real estate (houses and land) which have Locations on a map, but Resources / Assets can also include assets such as cars used as additional security.
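For readers who think in code, the one-page pictorial can be treated as a small labelled graph: icons are nodes and the hand-drawn lines are labelled edges. The following is a minimal sketch under stated assumptions. The pattern names come from the description above, but the `Framework` class and every relationship label are hypothetical illustrations, not part of any published pattern.

```python
from dataclasses import dataclass, field


@dataclass
class Framework:
    """A one-page schematic: pattern icons plus labelled lines between them."""
    concepts: set = field(default_factory=set)
    relationships: list = field(default_factory=list)  # (source, label, target)

    def add_concept(self, name):
        self.concepts.add(name)

    def relate(self, source, label, target):
        # Both ends must already be on the page, mirroring how workshop
        # participants first pick icons and then draw lines between them.
        for end in (source, target):
            if end not in self.concepts:
                raise ValueError(f"unknown concept: {end}")
        self.relationships.append((source, label, target))

    def neighbours(self, name):
        """Concepts directly connected to `name`, in either direction."""
        linked = set()
        for source, _, target in self.relationships:
            if source == name:
                linked.add(target)
            elif target == name:
                linked.add(source)
        return linked


title_registry = Framework()
for icon in ("Party & Role", "Agreement", "Document", "Resource / Asset", "Location"):
    title_registry.add_concept(icon)

# Relationship labels below are hypothetical examples, not taken from the paper.
title_registry.relate("Party & Role", "is signatory to", "Agreement")
title_registry.relate("Agreement", "is evidenced by", "Document")
title_registry.relate("Agreement", "concerns", "Resource / Asset")
title_registry.relate("Resource / Asset", "is situated at", "Location")
```

Calling `title_registry.neighbours("Agreement")` returns the concepts that share a line with Agreement, which is a quick way to check that nothing on the page has been left unconnected.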
Just for the sake of comparison, let’s look at a framework from an emergency response organization.
There are several pattern-based building blocks in common. It’s a bit like differentials in a car. A simple 2-wheel drive vehicle will typically have one differential, but a four-wheel drive might have 2 or 3 differentials. The components are common, but they are assembled in different ways that reflect the distinct features of the car (or organization).
Does this approach work? Can non-technical people actually grapple with light-weight data modeling concepts? I remember walking in on a conversation in a kitchen at one client site. A very senior executive was sharing enthusiastically with a colleague about the enterprise data model he’d participated in developing the previous week. How’s that for teamwork across the business and the data folk!
The biggest benefit I’ve observed (in addition to ownership) for these high-level conceptual schematics is the way a common information model facilitates communication, between business people from different silos in the organization, and between business people and data people. That’s quite an outcome. And assuming the facilitator has read and absorbed the patterns from Hay and Silverston, all of that can be achieved in a few days. Not a bad investment.
Weeks 2-4: Targeted Drill-Down
After one week, the business and data folk now have an agreed framework for core data subject areas. They have a common language, with a common understanding. The value of improved understanding and precision in communication should not be underestimated. Nonetheless, the larger goal is to turn this understanding into tangible delivery for the business.
This project plan sets aside three weeks for multiple iterations. The central message is that trying to flesh out the complete enterprise model can be like trying to boil the ocean. Instead, we should only pay attention to areas most likely to deliver welcome relief to the business.
Each iteration involves two tasks: (1) Identify an area of real pain to the business, then (2) flesh out relevant details in the overall model. These two tasks are described in more detail below.
Thanks to the preceding work, the business and the data professionals now have a common language. They can participate as equals in looking at areas of the business where poor data management practices are hurting. This analysis is performed in what I call “pain point” workshops.
The initial focus is to understand the pain rather than too quickly jumping into solution mode. Solutions come soon. It reminds me of a time my lovely wife was upset. Being the sort of bloke I am, I immediately tabled a perceived solution. The response? “I don’t want a solution. I want a cuddle.” Hey, I can do that! There’s a time for solutions, and there’s a time for simply understanding the problems.
Data Model Drill-Down
Immediately following each pain point workshop, I work with the data people to flesh out the high-level schematic that had been generated in the first week, developing a more detailed model.
For each iteration, we take the published patterns and treat them as supertypes and map the business-specific concepts as subtypes. An example for a health practitioner regulation scenario follows:
The entities presented in white represent the generalized patterns as they appear in the business schematic. The pink entities are the specializations as they appear during the drill-down exercise.
It is important to note that the subtypes are represented in a multi-level hierarchy. When it comes to physical implementation, the hierarchies may be simplified / flattened (e.g. for a relational implementation, or for a Data Vault design). But for the business data model, multiple levels are fine if they aid in communication.
In addition to identifying the subtypes of the patterns, we also record discovered specializations of inter-relationships. These can be relationships between subtypes in one hierarchy (e.g. between a Health Practitioner and a Care Recipient). They can also be relationships across domains (e.g. between an Occupational Health & Safety Incident as a subtype of Event, and the Employee from the above diagram).
Finally, important attributes are identified and added to the model.
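The drill-down steps (patterns as supertypes, business concepts as subtypes, then attributes) can be sketched in a few lines of code, along with one possible way of flattening the hierarchy for a physical design. This is an illustration only: the `Entity` class, the `flatten` function, the placement of Health Practitioner and Care Recipient under a Person subtype of Party, and all attribute names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    attributes: list = field(default_factory=list)
    subtypes: list = field(default_factory=list)

    def add_subtype(self, name, attributes=None):
        """Attach a business-specific specialization and return it."""
        child = Entity(name, list(attributes or []))
        self.subtypes.append(child)
        return child


def flatten(entity, inherited=()):
    """Yield (leaf name, attributes) pairs, pushing supertype attributes
    down to the leaves. This is one possible flattening among several."""
    attrs = list(inherited) + entity.attributes
    if not entity.subtypes:
        yield entity.name, attrs
    for child in entity.subtypes:
        yield from flatten(child, attrs)


# Generalized pattern (supertype) with hypothetical attributes.
party = Entity("Party", ["party_id"])
person = party.add_subtype("Person", ["family_name"])
person.add_subtype("Health Practitioner", ["registration_number"])
person.add_subtype("Care Recipient")
party.add_subtype("Organization", ["legal_name"])

flat = dict(flatten(party))
```

Here `flat["Health Practitioner"]` carries the attributes of Party and Person as well as its own, which is the trade-off a flattened design makes: fewer structures, at the cost of repeating inherited attributes.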
It’s important to note that I don’t try to boil the ocean. Only those pattern stubs that need further investigation to address business needs are detailed. A “sufficient” model can be detailed in a few weeks. Benefits include:
- Facilitation of communication between people.
- Facilitation of communication between computers (e.g. XML or JSON across microservices).
- Provision of a foundation for integration, be it as part of a master data management initiative, a data warehouse / Data Vault build, a service oriented architecture program, or whatever.
An example of a fully-worked enterprise data model (using class modeling notation – that’s another story) can be downloaded from the following link: http://www.countrye.com.au/principal-consultant/acme-class-model-v01-01/
Week 5: Determine the Strategy
The first 4 weeks of the plan can look somewhat similar across all sorts of projects, but this is the point where similarities often end. While some people like myself may have a lot of fun developing the framework and its targeted drill-down, we are doing this work to deliver value to the business. I keep saying “business, business, business” because IT doesn’t exist for its own pleasure. Which is sort of a pity, because many of us really enjoy our more technical roles. But reality can be harsh!
Hopefully, well before Week 1 even commenced, somebody had articulated why we are doing all of this stuff. Maybe they wanted an enterprise data model to represent the “to be” future state of an ideal data architecture. Maybe they wanted to build an enterprise data warehouse – if it’s a Data Vault, the subtypes representing business concepts may map to Data Vault hubs, and their relationships may become Data Vault links. Maybe they’re getting “Agile,” and want a fast-&-good data model to jump-start their project. The list goes on & on.
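To make the Data Vault scenario a little more concrete, here is a minimal sketch of the mapping from business concepts to candidate hubs, and from relationships to candidate links. It is an illustration under assumptions, not a Data Vault design method: the function names and the `hub_` / `link_` naming convention are hypothetical, and real designs involve decisions (business keys, satellites, and so on) that this sketch ignores.

```python
import re


def to_identifier(name):
    """Lower-case a business name and replace non-alphanumerics with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def sketch_data_vault(concepts, relationships):
    """Map business concepts to candidate hubs and relationships to candidate links.

    The 'hub_' / 'link_' prefixes are an illustrative naming convention only.
    """
    hubs = {concept: "hub_" + to_identifier(concept) for concept in concepts}
    links = []
    for source, target in relationships:
        if source not in hubs or target not in hubs:
            raise ValueError(f"relationship references unknown concept: {source} / {target}")
        links.append({
            "name": f"link_{to_identifier(source)}_{to_identifier(target)}",
            "hubs": [hubs[source], hubs[target]],
        })
    return hubs, links


hubs, links = sketch_data_vault(
    concepts=["Health Practitioner", "Care Recipient"],
    relationships=[("Health Practitioner", "Care Recipient")],
)
```

The point of the sketch is simply that the business-level subtypes and relationships give a ready-made list of candidates; whether each one deserves its own hub or link remains a design decision.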
Whatever the catalyst was for getting the funding, your big-picture enterprise data model has to be put to work. I’ve been involved in all the above scenarios, and more. Yet in spite of their vast differences, I have observed a common thread.
Firstly, at this stage, there is often a need to resolve specifics of the way to move forward. If it’s an agile project, what bits of the product backlog are we going to pull down first? If it’s a Data Vault design, there may be debate about tools or aspects of the architecture (do we use Hadoop, are we getting real-time feeds or mini-batches?). And if we are aiming to develop an IT strategy of which the data architecture is but a part, there will be myriad questions to answer.
Whatever the motivation, steps may include the following:
- The business sets the priorities on which pain points identified in the previous weeks are really
- IT takes responsibility for generating, evaluating, and scoring technical solution options.
- The business and IT jointly consider business priorities and technical considerations to draft a recommended way forward.
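One simple way to combine the business priorities with IT's scoring is a weighted sum. The sketch below assumes that approach purely for illustration; the paper does not prescribe a scoring method, and the pain points, option names, weights, and scores are all hypothetical.

```python
def rank_options(options, pain_weights):
    """Rank options by the sum of (business weight x technical score) over pain points."""
    totals = {
        name: sum(pain_weights[pain] * score for pain, score in scores.items())
        for name, scores in options.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs: weights set by the business, scores (0 to 5) assigned by IT.
pain_weights = {"duplicate customer records": 3, "slow regulatory reporting": 2}
options = {
    "Option A": {"duplicate customer records": 4, "slow regulatory reporting": 1},
    "Option B": {"duplicate customer records": 2, "slow regulatory reporting": 5},
}

ranking = rank_options(options, pain_weights)
```

A ranking like this is a conversation starter for the joint session rather than a decision in itself; the recommended way forward still needs both sides in the room.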
Weeks 6-8: Prove the Strategy
Before we seek funding for the preferred direction, let alone launch out in its delivery, maybe we need to prove its feasibility, to construct a proof of concept. If it’s an agile project, we may want to do a spike or two to iron out any technical wrinkles. If we’re looking at buying a package, maybe we want to do some benchmark testing of candidates.
The list goes on, but this optional step may be prudent, and may have to be tailored (and scheduled!) to fit your circumstances.
Making This Work For You
The project plan described above is to be taken as a guideline only. You are likely to find bits that suit you, and bits that need to be adapted. Below are some things you might like to consider doing.
Read ‘Data Modeling for the Business’
This paper and its associated publications are deliberately light-weight introductions to a number of topics. Hopefully you are encouraged by what is possible to achieve.
A complementary, easy-to-read book that combines the wisdom from multiple authors is “Data Modeling for the Business” by Steve Hoberman, Donna Burbank, and Chris Bradley. I highly recommend the entire book, but of particular interest in the context of this paper is Chapter 8: “Creating a Successful High-Level Data Model.” It is a go-to reference, describing in detail 10 steps to delivering tangible value.
Gain Familiarity With the Data Model Patterns
As noted earlier, the timeframes for delivery I have presented assume a solid working knowledge of the data model patterns.
So first the bad news (but don’t give up at this point). While I unreservedly endorse the published data model pattern books of David Hay and Len Silverston, they are large works, together adding up to well above 2,000 pages.
Now the good news. I’ve condensed their patterns, my own, and those of others, into a jump-start kit. And it’s free. Just download my light-weight data model patterns (titled “Patterns-D-Lite”) document using the following link:
Run a Top-Down Workshop in Your Own Organization
If you recall, on Day 5 of my template approach, the intention was to run a one-day workshop with business and IT people. The key deliverable was to be a one-page pictorial of the essential concepts in the enterprise. If you wish, you can view a video where I lead some people through running a similar exercise. It’s only one hour, so it is light-weight, but it is available via Safari Books – search for “DIY Corporate Data Model: Develop your own corporate data model framework in 3 hours, using patterns.” Basically, the agenda is something like the following.
- In preparation, access “Patterns-D-Lite” as described above to:
  - Gain familiarity with the patterns.
  - Print a hard copy of the 9-Pillar Palette of core patterns as presented below:
  - Grab a hard copy of the diagram of what I call the ‘Pattern of Patterns’, namely the diagram portraying the most common relationships between the 9-Pillar Patterns, again as presented below:
  - Make a paper copy of the two diagrams above for each workshop participant.
- Run the workshop:
  - Introduce the patterns, usually starting with ones that are easier for non-technical people to understand (parties and their roles, agreements).
  - Ask participants to identify business-related subtypes for the introduced patterns e.g. types of roles and types of agreements, and hand-write these subtypes on their copy of the 9-Pillar Palette.
  - Identify relationships between the subtypes, using the ‘Pattern of Patterns’ diagram as a prompter for candidate relationships.
  - Repeat the above steps for remaining patterns (Event, Task …).
  - Collaboratively consolidate the diagrams into one diagram, resolving naming differences.
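If the participants' hand-written sheets are typed up after the workshop, the final consolidation step can be recorded with something as simple as the sketch below. It is only an illustration: the `consolidate` function, the sheet structure, and the sample subtypes and synonym are assumptions, and the synonym map stands in for whatever naming the group agrees on in the room.

```python
from collections import defaultdict


def consolidate(participant_sheets, synonyms):
    """Merge each participant's {pattern: [subtype, ...]} sheet into one,
    applying an agreed synonym map to resolve naming differences."""
    merged = defaultdict(set)
    for sheet in participant_sheets:
        for pattern, subtypes in sheet.items():
            for subtype in subtypes:
                merged[pattern].add(synonyms.get(subtype, subtype))
    return {pattern: sorted(names) for pattern, names in merged.items()}


# Hypothetical sheets from two participants.
sheets = [
    {"Party & Role": ["Vendor", "Purchaser"]},
    {"Party & Role": ["Seller", "Purchaser", "Solicitor"], "Agreement": ["Bank Loan"]},
]
# The group agreed that "Seller" and "Vendor" mean the same thing.
agreed_names = {"Seller": "Vendor"}

one_page = consolidate(sheets, agreed_names)
```

The code does none of the hard work, of course. Agreeing that two names mean the same thing is a conversation between people; the sketch just captures the outcome.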
The full-day workshop, let alone the preceding education in patterns, cannot be simply compressed into a few bullet points, but maybe this is enough to get you started. You may be pleasantly surprised at the relative ease in assembling a pattern-based perspective of your organization. And remember, because the data model patterns are solidly grounded on implementation experiences, you should feel confident in their ability to drive delivery of tangible benefits for the business – the topic of the next paper in the trilogy.