Several practitioners have contributed to this complex and elusive subject (see Dan Power’s “Five Essential Elements of MDM and CDI,” for example) and have done a good job at elaborating
the essential elements. One more element is often overlooked in this field, and it has been the key differentiator between success and failure among the major initiatives I have had
the opportunity to witness firsthand: modeling the blueprint for MDM.
This is an important first step to take, assuming the business case is completed and approved. It forces us to address the very real challenges up front, before embarking on a journey that our
stakeholders must understand and support in order to succeed. Obtaining buy-in and executive support means we all share a common vision for what we are solving for. MDM is more than maintaining a
central repository of master data. The shared reference model should provide a resilient, adaptive blueprint to sustain high performance and value over time. An MDM solution should include the tools
for modeling and managing business knowledge of data in a sustainable way. This may seem like a tall order, but consider the
implications if we focus on the tactical and ignore the reality of how the business will actually adopt and embrace all of your hard work. Or worse, imagine asking the business to stare at a blank sheet of
paper and expecting them to tell you how to rationalize and manage the integrity rules connecting data across several systems, eliminate duplication and waste, and ensure that an authoritative source of
clean, reliable information can be audited for completeness and accuracy. Still waiting?
The following diagram illustrates where the MDM blueprint originates in an MDM capability assessment of supporting tools and resources (Modeling and
Analysis).
The essential thing to remember is that the MDM project is a business project that requires establishing a common information model that
applies regardless of the technical infrastructure or patterns you plan to use. The blueprint should remain computation and platform independent until the Operating Model is defined (and accepted
by the business), and a suitable Common Information Model (CIM) and Canonical model are completed to support and ensure the business intent. Then, and only then, are you ready to tackle the Reference
Architecture.
The Blueprint
So what is in this blueprint? The essential elements should include:
- Common Information Model
- Canonical Model
- Operating Model, and
- Reference Architecture (e.g., 4+1 views, Rozanski and Woods viewpoints and perspectives).
We will now turn our attention to the first element, the Common Information Model.
A Common Information Model (CIM) is defined using relational, object, hierarchical, and semantic modeling methods. What we are really developing here is a rich semantic data architecture in selected
business domains using:
- Object Oriented modeling (reusable data types, inheritance, operations for validating data)
- Relational (manage referential integrity constraints – Primary Key, Foreign Key)
- Hierarchical (Nested data types and facets for declaring behaviors on data – e.g., think XML schemas)
- Semantic models (ontologies defined through RDF, RDFS and OWL)
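To make the object-oriented item in this list concrete, here is a minimal sketch in Python. The `Party` and `Organization` types, their fields, and the validation rules are hypothetical names I made up for illustration; they are not taken from any published model. The point is only to show a reusable base type, inheritance, and an operation that validates data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Party:
    """Hypothetical base type shared by every kind of party."""
    party_id: str
    name: str

    def validate(self) -> List[str]:
        """Return a list of rule violations (empty when the record is clean)."""
        errors = []
        if not self.party_id.strip():
            errors.append("party_id is required")
        if not self.name.strip():
            errors.append("name is required")
        return errors


@dataclass
class Organization(Party):
    """Inherits Party's attributes and rules, then adds its own (assumed) rule."""
    country_code: str = ""

    def validate(self) -> List[str]:
        errors = super().validate()
        if len(self.country_code) != 2 or not self.country_code.isalpha():
            errors.append("country_code must be two letters")
        return errors


good = Organization(party_id="P-001", name="Acme Ltd", country_code="GB")
bad = Organization(party_id="P-002", name=" ", country_code="GBR")
print(good.validate())
print(bad.validate())
```

Because the rules live on the type, every system that reuses `Party` gets the same definition of a valid record instead of re-implementing it locally.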
I believe (others may not) that MDM truly represents the intersection of Relational, Object, Hierarchical and semantic modeling methods to achieve a rich expression of the reality the organization is
operating in. Expressed in business terms, this model represents a “foundation principle” or theme we can pivot around to understand each facet in the proper context. This is not easy to
pull off, but it will provide a fighting chance to resolve semantic differences in a way that helps focus the business on the real matter at hand. This is especially important when developing the
Canonical model introduced in the next step.
If you want to see what one of these looks like, visit the MDM Alliance Group (MAG). MAG is a community Pierre Bonnet
founded to share MDM modeling procedures and prebuilt data models. The MDM Alliance Group publishes a set of prebuilt data models that include the usual suspects (Location, Asset, Party, Party
Relationship, Party Role, Event, Period [Date, Time, Condition]), downloadable from the website, along with some more interesting models such as Classification (Taxonomy) and Thesaurus, organized across three
domains. Although we may disagree about the “semantics,” I do agree with him that adopting this approach can help us avoid setting up siloed reference databases
“…unfortunately often noted when using specific functional approaches such as PIM (Product Information Management) and CDI (Customer Data Integration) modeling.” How true. This is a
very common issue in the initiatives I encounter.
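As a rough illustration of why a shared Party and Party Role structure helps avoid silos, the sketch below uses Python's built-in `sqlite3` module. The table and column names (`party`, `party_role`, `party_id`, `role`) are my own assumptions and are not the MAG models themselves. One party is stored once and holds two roles, and the foreign key rejects a role that points at a party that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE party (
        party_id TEXT PRIMARY KEY,
        name     TEXT NOT NULL
    );
    CREATE TABLE party_role (
        party_id TEXT NOT NULL REFERENCES party(party_id),
        role     TEXT NOT NULL,
        PRIMARY KEY (party_id, role)
    );
""")

# One master record, two roles: no separate customer and supplier silos.
conn.execute("INSERT INTO party VALUES (?, ?)", ("P-001", "Acme Ltd"))
conn.executemany(
    "INSERT INTO party_role VALUES (?, ?)",
    [("P-001", "customer"), ("P-001", "supplier")],
)

roles = [row[0] for row in conn.execute(
    "SELECT role FROM party_role WHERE party_id = ? ORDER BY role", ("P-001",)
)]
party_count = conn.execute("SELECT COUNT(*) FROM party").fetchone()[0]

# A role for an unknown party violates referential integrity.
try:
    conn.execute("INSERT INTO party_role VALUES (?, ?)", ("P-999", "customer"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(roles, party_count, orphan_rejected)
```

In a functional silo approach, the same company would typically appear once in a customer database and again in a supplier database, each with its own identifier and its own version of the name.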
Another good example is the CIM developed over the years at the Distributed Management Task Force (DMTF). You can get the CIM V2.20
Schema MOF, PDF and UML at their website and take a look for yourself. While this is not what most of us think of as MDM, they are solving for some of the same problems and challenges we face.
Even more interesting is what is happening in semantic technology. Building semantic models (ontologies) includes many of the same concepts found in the other modeling methods we have already
discussed, but further extends the expressive quality we often need to fully communicate intent. For example:
- Ontologies can be used at run time (queried and reasoned over)
- Relationships are first-class constructs
- Classes and attributes (properties) are set-based and dynamic
- Business rules are encoded and organized using axioms
- Schemas are graphs rather than trees (in contrast to XML schemas), and can be used for reasoning
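A toy sketch may help show what "queried and reasoned over at run time" means. The Python below is not RDF or OWL tooling; it is a hand-rolled set of subject, predicate, object triples with made-up names (`Customer`, `KeyAccount`, `acme`, and so on) and a single inference rule, transitive subclassing, which loosely mirrors what `rdfs:subClassOf` provides in a real reasoner.

```python
# Each fact is a (subject, predicate, object) triple; all names are illustrative.
triples = {
    ("Customer", "subClassOf", "PartyRole"),
    ("KeyAccount", "subClassOf", "Customer"),
    ("acme", "type", "KeyAccount"),
    ("acme", "suppliedBy", "globex"),
}


def superclasses(cls, facts):
    """Return every class reachable from cls via subClassOf (transitive)."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in facts:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(individual, facts):
    """Asserted types plus the types inferred from the class hierarchy."""
    asserted = {o for s, p, o in facts if s == individual and p == "type"}
    inferred = set()
    for cls in asserted:
        inferred |= superclasses(cls, facts)
    return asserted | inferred


print(sorted(types_of("acme", triples)))
```

Nobody stated that `acme` is a `Customer` or a `PartyRole`; those facts are derived from the model when the question is asked. A real ontology editor and reasoner go much further, but the principle is the same.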
If you haven’t been exposed to ontology development, I encourage you to grab the open source Protégé
Ontology Editor and discover for yourself what this is all about. And while you are there, see the Protégé Wiki and grab the Federal Enterprise Architecture Reference Model Ontology
(FEA-RMO) for an example of its use in the EA world. Or see the set of tools found at the Essential project. The project uses this tool to enter model content, based on a model pre-built for
Protégé. While you are at the Protégé Wiki, grab some of the ontologies developed for use with this tool for other examples, such as the SWEET Ontologies (A Semantic Web
for Earth and Environmental Terminology. Source: Jet Propulsion Laboratory). For more on this, see my post on this tool at Essential Analytics. This is an interesting and especially useful modeling
method to be aware of and an important tool to have at your disposal.
This is hard, challenging work. Doing anything worthwhile usually is. The key differentiator between success and failure on your MDM journey will be taking the time to model the
blueprint and sharing this work early and often with the business.