facts to suit theories, instead of theories to suit facts.
Sherlock Holmes from A Scandal in Bohemia, Arthur Conan Doyle (1891)
That’s a great quote, especially if you spend your days (or nights) modeling data. Eventually all data analysts come to know that the
underlying truth in any data modeling endeavor is the data itself. However, we also come to realize that data comes in a broad range of cleanliness. All of us data modelers, at one point
or another, find this out the hard way, for there is no shortcut that replaces the experience of translating a client’s vision into data and software models and taking those models all the way
into physical implementation. In a way, modeling data is a bit like being a detective: gathering facts from our clients, gathering data to either support or refute those facts, and coming to a conclusion.
Documenting the conclusion we have reached, so that others can verify and test the line of reasoning used to arrive at it, is the ultimate goal of any complete modeling approach. You must
have a solid understanding of the process steps of the approach you use. Beyond those steps, there are a number of points that can help you succeed in modeling your
client’s Universe of Discourse (UoD), no matter what approach you choose.
In 1996 Carl Sagan wrote a book titled The Demon-Haunted World: Science as a Candle in the Dark. One of the chapters is titled The Fine Art of Baloney Detection. In it Sagan
outlines a checklist for skeptical thinking. The checklist offers activities and advice for examining a line of reasoning to
determine whether it is plausible or simply falls apart. As Sagan puts it, ‘the question is not whether we like the conclusion that emerges out of a train of reasoning, but
whether the conclusion follows from the premise or starting point and whether that premise is true.’
From the perspective of Object-Role Modeling and other data analysis methodologies, let’s consider each tool in the kit. I have taken the liberty of paraphrasing some of Sagan’s points
to fit the semantics of data modeling, being careful, I would like to believe, not to lose the intent of each point. Sagan’s 10 points:
1) Seek out independent confirmation of the ‘facts’ whenever possible. (CSDP Step #1)
draw out the facts that define these components in our client’s world. These examples can range from data extracts to forms to reports and many others. The most challenging situations are new
systems that don’t have a legacy to draw from. There we must rely on interviewing techniques to draw out facts, and on techniques that model the users’ external view of their UoD, to provide a
concrete way to view those facts. In all of these cases, it is necessary to confirm the collected facts, and the sources that represent them, using another person or system that can
support or refute the facts.
Analysis to Logical Design’, Terry Halpin, Morgan Kaufmann Publishers, 2001)
2) Encourage substantive debate on the evidence by knowledgeable proponents from all points of view.
achieve a solution that is acceptable. However, you can overdo this, so practice due diligence but don’t go overboard. Three benefits of a good methodology should be:
- Organizing ideas from a wide variety of sources
- Reaching consensus among a wide range of people’s opinions and views
- Providing a way to focus on particular subject areas within a complex UoD.
3) Be fair to the process and treat each person equally as a subject matter expert in their area of knowledge – avoid putting people in the position of being an authority on a
#1). Simply be as cooperative as you can, collect their contributions with appreciation, and then quietly find a way to verify the information.
4) Spin more than one way of looking at your UoD – your first cut may be only a partial solution.
this can be a challenge at times; however, the more patterns you create and expose yourself to, the more confident you will become in recognizing alternative approaches to addressing the
client’s vision and goals.
5) Seek out others for critical feedback – don’t get overly attached to one solution.
that may not be as useful. Demand that your reviewers play the role of a devil’s advocate and have them look for ways of rejecting the model being reviewed. This will strengthen the model and
test it in ways that you may not have thought of. Besides, if the model fails, it will be a temporary setback, and ultimately the model will be the better for it.
6) Populate – gather example data for the facts. (CSDP Step #2)
all you have is a pile of related opinions. This is certainly a good starting point; however, ORM provides a clear path to take the model much further. Fact populations are used to extend the
model’s meaning with constraints, provide a way to check the correctness of the model, test the model with process scenarios, and lend credibility to the model. Of equal importance, they allow the
business user to ‘experiment’ with the model and test out their own set of data to determine if the facts hold up.
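As a rough illustration of how a fact population can test a proposed constraint, here is a minimal Python sketch. The fact type "Employee works in Department", the sample rows, and the helper name uniqueness_violations are all invented for this example; they are not part of ORM or of any ORM tool. The idea is simply that a uniqueness constraint on a role holds only if no value repeats in that role's column.

```python
from collections import defaultdict

# Hypothetical sample population for the fact type
# "Employee works in Department" (all names invented for illustration).
works_in = [
    ("Ann", "Sales"),
    ("Bob", "Sales"),
    ("Cho", "Research"),
    ("Ann", "Research"),  # Ann appears twice in the Employee role
]

def uniqueness_violations(population, role):
    """Return {value: rows} for every value that occurs more than once
    in the given role (a column index into each row)."""
    rows_by_value = defaultdict(list)
    for row in population:
        rows_by_value[row[role]].append(row)
    return {value: rows for value, rows in rows_by_value.items() if len(rows) > 1}

# Does "each Employee works in at most one Department" survive the sample data?
employee_violations = uniqueness_violations(works_in, role=0)
print(employee_violations)
```

If the business user confirms that the rows for Ann are legitimate, the proposed constraint does not hold and the fact type is many-to-many; if the rows are a data entry error, the constraint stands and the sample data gets corrected.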
7) If there is a chain of argument in the model, then every link in the chain must work.
Dynamic Models, Process Maps, Data Flow Diagrams, Sequence Diagrams, etc. All are valid; the important thing is to use something to capture the software requirements. The goal is to use the process
scenarios against the object role model with example data to test a particular chain of facts involved in the process scenario. You must be able to process the entire scenario without any
perceivable gaps, such as a missing fact needed to support a process or, after all process scenarios have been considered, leftover facts that were not used by any of them. If there are
gaps, then the process scenario, the data model, or both need to be adjusted.
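The two kinds of gap described above, missing facts and leftover facts, can be sketched as simple set comparisons. The fact types, scenario names, and the coverage_gaps helper below are hypothetical, chosen only to show the idea; a real project would draw them from its own model and requirements documents.

```python
# Hypothetical fact types currently in the model (invented for illustration).
model_fact_types = {
    "Customer places Order",
    "Order contains Product",
    "Product has Price",
    "Customer has FaxNumber",
}

# Hypothetical process scenarios, each listing the fact types it relies on.
scenarios = {
    "Take order": {"Customer places Order", "Order contains Product"},
    "Invoice order": {
        "Order contains Product",
        "Product has Price",
        "Order ships to Address",
    },
}

def coverage_gaps(model, scenarios):
    """Return (missing, leftover): facts a scenario needs that the model lacks,
    and facts in the model that no scenario uses."""
    used = set().union(*scenarios.values())
    missing = {
        name: sorted(facts - model)
        for name, facts in scenarios.items()
        if facts - model
    }
    leftover = sorted(model - used)
    return missing, leftover

missing, leftover = coverage_gaps(model_fact_types, scenarios)
print("Missing facts by scenario:", missing)
print("Facts used by no scenario:", leftover)
```

A leftover fact is not automatically wrong; it may point to a scenario nobody has written down yet. Either way, it is a question to take back to the client.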
8) Experience urges us to use the simpler of two models when each equally models the data.
Choose the simpler model if your client can verify that the model has an acceptable degree of stability for the foreseeable future. Here we have to rely on the business user’s experience and
their understanding of the business’s strategic goals to guide everyone in making the best choice. Use these strategies to help drive the construction of key patterns in your model to be as
flexible as possible. Accomplish this without getting so abstract that it makes the model too complex to work.
9) Ask how the examples can be falsified; in other words, purposely break the constraints derived from fact populations to test them.
because those examples help define the boundaries of what is acceptable. Creating examples that conflict with what the business user would allow can help solidify their understanding of a fact and
in some situations can serve as a pointer to missing roles.
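One way to picture this is to build counterexamples on purpose and see whether the model rejects them. The sketch below assumes a hypothetical fact type "Person holds Passport" that the business user has described as one-to-one; the rows, the questions, and the breaks_uniqueness helper are invented for illustration only.

```python
# Hypothetical population for the fact type "Person holds Passport",
# assumed here to be one-to-one (all names invented for illustration).
holds_passport = [
    ("Ann", "P100"),
    ("Bob", "P200"),
]

def breaks_uniqueness(population, candidate, role):
    """Return True if adding `candidate` would repeat a value
    already present in `role` (a column index)."""
    return any(row[role] == candidate[role] for row in population)

# Deliberately constructed counterexamples, each aimed at one constraint.
counterexamples = {
    "Can Ann hold a second passport, P300?": (("Ann", "P300"), 0),
    "Can Cho share passport P100 with Ann?": (("Cho", "P100"), 1),
}

for question, (row, role) in counterexamples.items():
    if breaks_uniqueness(holds_passport, row, role):
        verdict = "rejected by the model"
    else:
        verdict = "accepted by the model"
    print(f"{question} -> {verdict}")
```

Each question is then put to the business user. If they answer "yes" to something the model rejects, the constraint is too strong; if they hesitate, the example may be pointing at a role that is still missing.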
10) Can others duplicate and understand the line of reasoning and come to the same conclusion?
Modeling is much more than simply documenting a system; it is the manner in which one arrives at a particular model of the client’s vision that lends the conclusion credibility.
In ORM, the key to understanding the user’s UoD is through populating the facts. In a sense this is the manner in which ORM runs small ‘experiments’ against the model starting
with a fact. Larger ‘experiments’ can be run against collections of facts to test the validity of the model and determine the correctness of the more complex constraints and rules.
Facts (objects playing roles), and the data that populates them, are the heart and soul of ORM. This is what makes ORM such a comprehensive approach to data modeling. Sagan’s checklist
certainly fits well with the goals of ORM and can serve as helpful advice for any data modeler.
For more information about Sagan’s Baloney Detection Kit, you can visit this web site: www1.tpgi.com.au/users/tps-seti/baloney.html or find a copy of Carl Sagan’s 1996 book titled The Demon-Haunted World: Science as
a Candle in the Dark (ISBN 0-345-40946-9).
Previously Published in the Journal of Conceptual Modeling – www.inconcept.com/jcm