Just the Facts

Published in TDAN.com October 2004
It is a capital mistake to theorize before one has data. Insensibly one begins to twist
facts to suit theories, instead of theories to suit facts.
Sherlock Holmes from A Scandal in Bohemia, Arthur Conan Doyle (1891)

That’s a great quote, especially if you spend your days and/or nights modeling data, and more so if you are a data analyst or modeler. Eventually all data analysts come to know that the
underlying truth in any data modeling endeavor is the data itself. However, we also come to realize that data comes in a broad range of cleanliness. All of us (data modelers, that is), at one point
or another, find this out the hard way, for there is no shortcut that replaces the experience of translating a client’s vision into data/software models and taking those models all the way
into physical implementation. In a way, modeling data is a bit like being a detective: gathering facts from our clients, gathering data to either support or refute the facts, and coming to a conclusion.

Documenting the conclusion we have reached, so that others can verify and test the line of reasoning used to arrive at it, is the ultimate goal of any complete modeling approach. You must
have a solid understanding of the process behind the approach you use. Beyond those process steps, there are a number of points that can help you succeed in modeling your
client’s Universe of Discourse (UoD), no matter which approach you choose.

In 1996 Carl Sagan wrote a book titled The Demon-Haunted World: Science as a Candle in the Dark. One of the chapters is titled The Fine Art of Baloney Detection. In it Sagan
outlines a checklist for skeptical thinking. The checklist offers activities and advice for examining a line of reasoning to
determine whether it is plausible or simply falls apart. As Sagan puts it, ‘the question is not whether we like the conclusion that emerges out of a train of reasoning, but
whether the conclusion follows from the premise or starting point and whether that premise is true.’

From the perspective of Object-Role Modeling and other data analysis methodologies, let’s consider each tool in the kit. I have taken the liberty of paraphrasing some of Sagan’s points
to fit the semantics of data modeling, being careful, I would like to believe, not to lose the intent of each point. Sagan’s 10 points:

1) Seek out independent confirmation of the ‘facts’ whenever possible. (CSDP Step #1)

To seek out independent confirmation of the facts, we must first discover the facts that are to be confirmed. Analysis of all forms of external examples in our client’s world is required to
draw out the facts that define the components of that world. These examples can range from data extracts to forms to reports and many others. The most challenging situations are new
systems that don’t have a legacy to draw from. There we must rely on interviewing techniques to draw out facts, and on techniques for modeling the users’ external view of their UoD, so that those facts can be
viewed in a concrete way. In all of these cases, it is necessary to confirm the collected facts, and the sources that represent them, with another person or system that can
support or refute them.
(Where applicable, a point is related to a process step in Terry Halpin’s Conceptual Schema Design Procedure [CSDP]. {‘Information Modeling and Relational Databases: From Conceptual
Analysis to Logical Design’, Terry Halpin, Morgan Kaufmann Publishers, 2001})
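
As a rough illustration of point 1, the sketch below compares elementary facts collected from two independent sources. The fact type (Employee works in Department), the source names, and every value are hypothetical; the only idea taken from the text is that a fact counts as confirmed when a second source supports it.

```python
# Hypothetical elementary facts (employee, department) from two independent
# sources. All names and values are illustrative assumptions.
report_facts = {("Adams", "Sales"), ("Baker", "Finance"), ("Chen", "Sales")}
interview_facts = {("Adams", "Sales"), ("Baker", "Audit"), ("Chen", "Sales")}


def compare_sources(primary, secondary):
    """Split facts into confirmed ones and ones that need follow-up."""
    confirmed = primary & secondary    # both sources assert these facts
    unconfirmed = primary - secondary  # only the primary source asserts these
    unexpected = secondary - primary   # only the secondary source asserts these
    return confirmed, unconfirmed, unexpected


confirmed, unconfirmed, unexpected = compare_sources(report_facts, interview_facts)
```

Facts that land in `unconfirmed` or `unexpected` are not necessarily wrong; they are simply the ones worth taking back to the client.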

2) Encourage substantive debate on the evidence by knowledgeable proponents from all points of view.

A data analyst must get to the widest possible cross section of the business that is reasonable. Involving as diverse a base of business users as is possible will in the long run help your client
achieve a solution that is acceptable. However, you can overdo this, so practice due diligence but don’t go overboard. Three benefits of a good methodology should be:
  • Organizing ideas from a wide variety of sources
  • Reaching consensus among a wide range of people’s opinions and views
  • Providing a way to focus on particular subject areas within a complex UoD.

3) Be fair to the process and treat each equally as a subject matter expert in their area of knowledge – avoid putting people in the position of being an authority on a

Seek out business people who are confident of their knowledge in their subject area. Warning lights should go off when someone tells you that they are the only authority in a subject area (remember
#1). Simply be as cooperative as you can, collect their contributions with appreciation, and then quietly find a way to verify the information.

4) Spin more than one way of looking at your UoD – your first cut may be only a partial solution.

There is more than one way to model a UoD. As analysts we need to devise a model that fits a particular situation with a minimum of complexity and that can completely service the software model. Doing
this can be a challenge at times. However, the more patterns you create and expose yourself to, the more confident you will become in recognizing alternative approaches to addressing the
client’s vision and goals.

5) Seek out others for critical feedback – don’t get overly attached to one solution.

As analysts we must stay as objective as possible when addressing a particular model, and we should seek out reviews of our models by others. Doing this early in the process can help us avoid going down a path
that may not be as useful. Ask your reviewers to play devil’s advocate and look for ways of rejecting the model being reviewed. This will strengthen the model and
test it in ways that you may not have thought of. Besides, if the model fails, it will be a temporary setback and ultimately the model will be the better for it.
If you are having a particularly tough time with a model, put it away for a while or go work on another subject area and then come back to the problem later.

6) Populate – gather example data for the facts. (CSDP Step #2)

Populate the facts that were discovered using the data provided by the external examples and interviews. This of course is the heart and soul of Object-Role Modeling (ORM). Without example data,
all you have is a pile of related opinions. That is certainly a good starting point; however, ORM provides a clear path to take the model much further. Fact populations are used to extend the
model’s meaning with constraints, provide a way to check the correctness of the model, test the model with process scenarios, and lend credibility to the model. Of equal importance, they allow the
business user to ‘experiment’ with the model and test out their own set of data to determine if the facts hold up.
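
To make the role of a fact population concrete, here is a minimal sketch, assuming a hypothetical binary fact type "Employee works in Department" stored as rows of values. The helper name `violates_uniqueness` is an illustration, not part of any ORM tool. It reports which values repeat in a role, which is one way a sample population can rule out a candidate uniqueness constraint.

```python
from collections import defaultdict


def violates_uniqueness(population, role_index):
    """Return the values that occur more than once in the given role."""
    seen = defaultdict(list)
    for row in population:
        seen[row[role_index]].append(row)
    return {value: rows for value, rows in seen.items() if len(rows) > 1}


# Hypothetical sample population for: Employee works in Department
works_in = [("Adams", "Sales"), ("Baker", "Finance"), ("Chen", "Sales")]

# No employee repeats, so "each Employee works in at most one Department"
# is not refuted by this sample (it still needs the client's confirmation).
employee_dupes = violates_uniqueness(works_in, 0)

# "Sales" repeats, so a uniqueness constraint on the Department role is refuted.
department_dupes = violates_uniqueness(works_in, 1)
```

A sample population can only refute a constraint; a constraint that survives the sample still has to be confirmed with the business user.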

7) If there is a chain of argument in the model, then every link in the chain must work.

As part of the process of modeling the data, an analyst should be gathering a first cut of the software model requirements. These can take a number of forms, such as Use Case Diagrams,
Dynamic Models, Process Maps, Data Flow Diagrams, Sequence Diagrams, etc. All are valid; the important thing is to use something to capture the software requirements. The goal is to run the process
scenarios against the object-role model with example data to test the particular chain of facts involved in each scenario. You must be able to process the entire scenario without any
perceivable gaps, such as a missing fact needed to support a process. Likewise, after all process scenarios have been considered, there should be no leftover facts that were not used by any of them. If there are
gaps, then the process scenario, the data model, or both need to be adjusted.
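
The two checks described above (facts a scenario needs but the model lacks, and facts no scenario uses) can be sketched as simple set operations. The fact types and scenario names below are hypothetical, and representing a scenario as a list of fact-type readings is an assumption made only for this illustration.

```python
# Hypothetical fact types in the data model.
model_fact_types = {
    "Employee works in Department",
    "Employee has Salary",
    "Department has Budget",
    "Employee drives Car",
}

# Hypothetical process scenarios, each listing the fact types it relies on.
scenarios = {
    "Hire employee": ["Employee works in Department", "Employee has Salary"],
    "Approve budget": ["Department has Budget", "Department is managed by Employee"],
}


def check_coverage(model, scenarios):
    """Return (gaps per scenario, fact types unused by every scenario)."""
    gaps = {}
    for name, steps in scenarios.items():
        missing = [fact for fact in steps if fact not in model]
        if missing:
            gaps[name] = missing
    used = {fact for steps in scenarios.values() for fact in steps}
    leftovers = model - used
    return gaps, leftovers


gaps, leftovers = check_coverage(model_fact_types, scenarios)
```

Here the "Approve budget" scenario exposes a missing fact type, and "Employee drives Car" is left over, so either a scenario is missing or the fact does not belong in the model.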

8) Experience urges us to use the simpler of two models when each equally models the data.

We can model a UoD and perhaps transform it to a simpler model. There have been a number of papers written about model transformations and creating separate models that represent the data equally.
Choose the simpler model if your client can verify that the model has an acceptable degree of stability for the foreseeable future. Here we have to rely on the business user’s experience and
their understanding of the business’s strategic goals to guide everyone in making the best choice. Use these strategies to help drive the construction of key patterns in your model to be as
flexible as possible. Accomplish this without getting so abstract that it makes the model too complex to work.

9) Ask how the examples can be falsified, in other words purposely break the constraints derived from fact populations to test them.

Acceptable fact populations are needed to derive the constraints placed on a fact, and they are what give an ORM fact its complete usefulness. Equally important, however, are unacceptable populations,
because those examples help define the boundaries of what is acceptable. Creating examples that conflict with what the business user would allow can help solidify their understanding of a fact and
in some situations can serve as a pointer to missing roles.
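
The following sketch shows one way a deliberately "broken" example might point to a missing role. Everything in it is hypothetical: the fact type, the rows, and the added Year role are assumptions chosen for illustration. The counterexample puts one employee in two departments; if the business user says that is legitimate because the employee moved, the binary fact is probably missing a role.

```python
def unique_on(population, roles):
    """True if no combination of values in the given roles repeats."""
    keys = [tuple(row[i] for i in roles) for row in population]
    return len(keys) == len(set(keys))


# Counterexample for the binary fact type: Employee works in Department.
# It breaks the candidate constraint "each Employee works in one Department".
binary = [("Adams", "Sales"), ("Adams", "Finance")]

# The same rows with a hypothetical third role added: ... during Year.
ternary = [("Adams", "Sales", 2003), ("Adams", "Finance", 2004)]

binary_ok = unique_on(binary, roles=(0,))      # constraint on Employee alone
ternary_ok = unique_on(ternary, roles=(0, 2))  # constraint on Employee + Year
```

If the user rejects the counterexample outright, the original constraint stands; if the user accepts it once the year is stated, the wider fact type with a constraint spanning Employee and Year is the better model.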

10) Can others duplicate and understand the line of reasoning and come to the same conclusion?

Using an approach where we can test various process scenarios, with data, against a model to prove or disprove the validity of the model is enough to justify its use. Specifically, Object-Role
Modeling is much more than simply documenting a system; it is the manner in which it arrives at a particular model of the client’s vision that lends the conclusion credibility.

In ORM, the key to understanding the user’s UoD is through populating the facts. In a sense this is the manner in which ORM runs small ‘experiments’ against the model starting
with a fact. Larger ‘experiments’ can be run against collections of facts to test the validity of the model and determine the correctness of the more complex constraints and rules.
Facts (objects playing roles), and the data that populates them, are the heart and soul of ORM. This is what makes ORM such a comprehensive approach to data modeling. Sagan’s checklist
certainly fits well with the goals of ORM and can serve as helpful advice for any data modeler.

For more information about Sagan’s Baloney Detection Kit, you can visit this web site: www1.tpgi.com.au/users/tps-seti/baloney.html and find a copy of Carl Sagan’s 1996 book titled The Demon-Haunted World: Science as a Candle in the Dark (ISBN 0-345-40946-9).

Previously Published in the Journal of Conceptual Modeling – www.inconcept.com/jcm



About Dick Barden

Dick is a Senior Partner and Principal Consultant for InConcept. He has over 15 years of ORM/NIAM experience and is a certified ORM consultant and trainer and a certified Visio trainer.

Contact Information:
Dick Barden
Vice President and Co-Founder
8171 Hidden Bay Trail N
Lake Elmo, MN 55042
(651) 777-8484
fax: (651) 777-9634