Published in TDAN.com January 2001
Introduction
In technical literature and in various discussion groups, there seems to be a lot of confusion as to what is the “conceptual” level, what is the “logical” level, and what is the
“physical” level. It is the author’s opinion that the confusion lies in shifting meanings of the aforementioned terms. The root of this confusion illustrates the need to discuss a model in terms
of two separate and orthogonal categorizations: perspective and abstraction.
This paper will attempt to define these categorizations, define what their levels (sub-categories) are, and clear up the ever-prevalent confusion.
Common Confusion
Many people will use the terms “conceptual”, “logical”, and “physical” to mean what I term the “perspective”. In this usage of these terms, “conceptual” means a definition of the problem,
“logical” means a design of a solution to the problem, and “physical” means the solution of the problem.
Further, the use of these terms is confusing given the various modeling syntaxes available. For example, the class diagrams of the Unified Modeling Language (UML) are – as we will see –
clearly at the logical abstraction level. However, class diagrams can be created with the intention of merely stating the problem, the business, or the universe of discourse and not with the
intention of detailing a solution (e.g. code, schema) to the problem. Likewise, models that are specified in the Object-Role Modeling (ORM) syntax – which we will see is clearly at the conceptual
abstraction level – can be used to illustrate a design or implementation perspective of the problem; something (for example, annotating an index on a fact’s role) that many would term to be “not
at the conceptual level”.
Even more confusing, the terms “analysis” and “design” are often used interchangeably. Some organizations still call programmers “analysts” even if they seldom talk to users or subject matter
experts. Many people performing analysis tasks (specifying systems from the user perspective, drafting use cases, etc.) will call themselves “software designers”.
Subject matter experts will sometimes intensify this problem as well. Mired in this confusion, these universe of discourse experts will often talk about user interfaces, table designs, and
performance characteristics to an analyst who is trying to get them to simply state the problem in the business’s terms. This often causes the analyst to record these non-business-related details
in the analysis artifacts.
In order to describe the solution to this confusion, two orthogonal concepts need to be defined: perspective and abstraction.
Perspective
The first orthogonal concept that needs to be defined is that of perspective. The perspective of a model lies in the motivation of why the model is being created. The question to ask yourself is,
“Am I trying to define the problem, trying to divine one or more solutions to the problem, or trying to solve the problem?”
Analysis Perspective
The analysis perspective is the user’s “pain”. It is the reason you are tasked with defining and creating an information system. It is the answer to the question, “What is the problem?”
At the analysis perspective, we should attempt to get subject matter experts to detail the problem in the business terms, facts, and constraints without paying any attention (yet) to how we are
going to design or implement the solution. The terms “table”, “CORBA”, “Java”, “view”, “datatype”, and “performance” do not belong in an analysis discussion or any resulting analysis
artifacts.
User interface specifications do not belong at this level either. The author has seen many use cases written at the “analysis perspective” that refer to actors’ interactions with screens, reports,
and other systems. This is a mistake that will, in the author’s opinion, cause fundamental risk to the success of the project.
The analysis perspective is best detailed by using any syntaxes and/or methods that use a natural language. The use of a natural language allows the users to freely converse in a manner that is
normal to them. The analyst can then use this language as a direct input to the analysis artifacts (that will later serve as inputs to design and construction artifacts). Finally, since the
analysis artifacts are in a natural language, the users can directly and easily verify them.
Because of their use of a natural language, artifacts like ORM diagrams/specifications, use cases, natural language syntaxes such as ConQuer, and the business rules approach are extremely effective
here.
Design Perspective
The design perspective details possible solutions to the problem. It is at the design perspective that we begin to worry about things like performance, generalization and specialization, system
architectures (e.g. COM, CORBA), and database technology (e.g. relational, object-relational, object oriented, et al.).
Because of these additional concerns, we cannot rely on the analysis artifacts to specify the design. Rather, the analysis artifacts specify the requirements that any resulting design must meet. In
this manner, the analysis artifacts serve as an input to the design process and resulting artifacts.
Because designers are technical in nature, the use of natural language syntaxes and methods is no longer absolutely necessary (although it is the author’s contention that most technical people
understand natural language better than they understand technical languages and thus natural language approaches are still valuable at this perspective). Often the need to detail technical
specifications will cause the use of non-natural language syntaxes such as UML or the Object Constraint Language (OCL).
At any rate, the design perspective is clearly biased (and rightly so) towards the solution of the problem and not towards the actual problem itself.
Implementation Perspective
The implementation (or “construction”, if you prefer) perspective is now quite simply defined: it is the actual implementation of the design. In even simpler terms, it is the code or the data
definition language (DDL, often in SQL).
Now that we have figured out what the problem is (analysis) and have decided on a solution to the problem (design) it is time to solve the problem (implementation).
Because of the unexpected events that will always happen, the implementation may not always correspond to the design. Assumptions made during design may not be true, timelines may have been
changed, bugs in third party applications or programming environments may cause the need for last minute and creative solutions, or some programmer may have re-named a design variable/member/column
to “foobar”.
More often, performance and convenience factors may cause the implementation to differ from the design. New interfaces (be they actual code “interfaces” or things like views, stored procedures,
or additional methods) may be added. New or altered indexes may be needed to speed up query time. Some part of the design may be implemented in a different language than was originally thought and
thus the design may need to be tweaked due to the new language’s support (or total lack thereof) of the design constructs used (e.g. multiple inheritance, datatypes/precision), and so on.
That aside, the implementation often looks more like the design than the analysis does.
It is also worthwhile to note that the implementation is often not only specified in terms of textual code. Implementation (graphical) models may also be created (for example, the DDL expressed in
an Entity-Relationship or “ER” syntax).
Abstraction
Now that the main perspectives have been defined, it is easy to define the second orthogonal concept: the abstraction level. The abstraction level is the level of detail of a model. Generally
speaking, the higher the abstraction level, the higher the amount of detail (one might argue that code, at the physical level, is excruciatingly “detailed”, but we will come back to that). This
allows each lower abstraction level to be derived from the higher abstraction level — a concept that is quite useful in practice.
Conceptual Abstraction Level
The conceptual level often is the most verbose model abstraction level. It contains the model constructs in such a manner as to express more detail than the lower abstraction levels. It is also the
highest abstraction level, one that can derive the lower abstraction levels (logical and physical).
For example, ORM resides at the conceptual level. ORM’s use of elementary facts, object types, and roles allows many more constraints to be specified than the logical abstraction level that ER/OO
notations specify. Incidentally, this is the main argument against the use of ORM. This argument states that ORM models are too verbose, too large, and too detailed. It is the author’s opinion
that this is a good thing overall, and that when a less verbose model is needed/desired one simply has to move to a lower abstraction level.
ORM’s use of elementary facts means that functional dependencies are fully mapped out. Because of this, later normalization errors are impossible (assuming the proper use of ORM). This is also the
root of the “level of detail” distinction: you cannot get much more detailed than expressing each and every functional dependency.
It is with this meaning of the term “conceptual” and with the previous discussion on perspective that I contend that ER notations (and for that matter UML’s class diagrams), because of their use
of the logical concepts of attributes and entities, do not reside at the conceptual level.
Logical Abstraction Level
At a lower abstraction than the conceptual level resides the logical abstraction level. However, it is at the logical level that, sadly, most models are typically begun. At the logical level,
distinctions of relative importance are made. In other words, a model element is determined to be an “entity” or an “attribute” (or if you prefer, a “class” or a “member”, respectively).
Because of this lower abstraction level, many details get left out. For example, attribute level constraints (e.g. either attribute “a” is populated or attribute “b” is, but not both) can no
longer be expressed.
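To make the example constraint concrete, the following minimal sketch states it as a plain predicate over a record. The names `Record`, `a`, `b`, and `satisfies_exclusive_or` are hypothetical and chosen only for illustration; they are not taken from any modeling tool or notation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical entity with two optional attributes, "a" and "b"."""
    a: Optional[str] = None
    b: Optional[str] = None


def satisfies_exclusive_or(record: Record) -> bool:
    """Return True when exactly one of "a" and "b" is populated."""
    return (record.a is None) != (record.b is None)


if __name__ == "__main__":
    candidates = (Record(a="x"), Record(b="y"), Record(), Record(a="x", b="y"))
    for candidate in candidates:
        print(candidate, satisfies_exclusive_or(candidate))
```

The rule relates two attributes to each other rather than describing either one alone, which is the kind of detail the paragraph above says gets left out once a model is reduced to entities and attributes.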
Further, if models are begun only at this level, the risk of serious errors skyrockets. The decision of relative importance means that normalization errors may occur and that an element that is
modeled as an attribute will later need to be re-modeled as an entity. Further, many of the semantics of the relationships between model elements get lost. For example, at the conceptual
abstraction level, each element that later becomes an attribute has distinct roles and domains. These roles and domains are often altered at the logical level. The semantics of the attribute cause
the attribute to be named in such a manner as to indicate its role with the entity. However, if you are not careful, this re-named attribute may be mistaken to be of a different domain.
As a trivial (but taken from an actual reverse engineering effort) example to illustrate this point, consider a division code. A division code is a specific domain (it has a specific — and in this
case, a finite — set of allowable values). Now, consider the attributes of a legacy system: “Manufacturing Division Code”, “Employed By Division Code”, and “Corporate Group Code”. The first
two attribute names obviously imply a common domain; namely, Division Code. The latter attribute is also a Division Code, but its name does not imply this fact at all. At the conceptual level,
these attributes may be specified as: “Product(id) is manufactured by Division(code)”, “Employee(ssn) is employed by Division(code)”, and “Organization(name) belongs to the corporate group
designation of Division(code).”
The conceptual level’s use of facts (objects playing roles with each other) unambiguously defines domains and semantics. It is at the logical level that we may shorten those semantics via
attribute names. However, if we begin our modeling effort at the conceptual level, the domains map to the lower level consistently, completely, and accurately.
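The division code example can be sketched in a few lines. This is only an illustration, assuming a hypothetical `Fact` record that keeps the object types of each elementary fact next to the shortened attribute name used at the logical level; it is not ORM tooling and not the notation of any CASE product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One elementary fact plus the attribute name it is shortened to."""
    subject: str         # object type playing the first role
    predicate: str       # natural-language reading of the fact
    domain: str          # object type playing the second role
    attribute_name: str  # shortened name used at the logical level


# The three facts from the legacy system described above.
FACTS = [
    Fact("Product", "is manufactured by", "Division",
         "Manufacturing Division Code"),
    Fact("Employee", "is employed by", "Division",
         "Employed By Division Code"),
    Fact("Organization", "belongs to the corporate group designation of",
         "Division", "Corporate Group Code"),
]


def domains_by_attribute(facts):
    """Map each logical attribute name back to the domain its fact declares."""
    return {fact.attribute_name: fact.domain for fact in facts}


if __name__ == "__main__":
    for name, domain in domains_by_attribute(FACTS).items():
        print(f"{name!r} draws its values from {domain}(code)")
```

Because the domain travels with the fact, “Corporate Group Code” is still known to be a Division code even though its name no longer says so.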
In summary, it is not a bad idea to display a model at the logical level (indeed, the author frequently uses logical level models). But it is a bad idea to begin a modeling effort at the logical
level.
Physical Abstraction Level
The physical abstraction level is easier to define: it is at the “code” level. It is often only used to specify the implementation perspective, but we will talk more about that in the next
section.
This abstraction level may also elaborate other implementation details. For example, a many-to-many relationship between two (logical level) entities “a” and “b” may be illustrated via the
relational DBMS necessity of a one-to-many relationship between the tables “a” and (new, intersection) “ab” and a many-to-one relationship between tables “ab” and “b”.
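As a rough sketch of that elaboration, the following uses Python’s built-in `sqlite3` module to create the tables “a”, “b”, and the intersection table “ab”. The column names `a_id` and `b_id`, the sample rows, and the helper functions are assumptions made for this illustration only.

```python
import sqlite3

# Physical-level DDL: the many-to-many relationship becomes two
# one-to-many relationships that meet in the intersection table "ab".
DDL = """
CREATE TABLE a (a_id INTEGER PRIMARY KEY);
CREATE TABLE b (b_id INTEGER PRIMARY KEY);
CREATE TABLE ab (
    a_id INTEGER NOT NULL REFERENCES a (a_id),
    b_id INTEGER NOT NULL REFERENCES b (b_id),
    PRIMARY KEY (a_id, b_id)
);
"""


def build_example() -> sqlite3.Connection:
    """Create an in-memory database holding a few sample pairings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(DDL)
    conn.executemany("INSERT INTO a (a_id) VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO b (b_id) VALUES (?)", [(10,), (20,)])
    conn.executemany(
        "INSERT INTO ab (a_id, b_id) VALUES (?, ?)",
        [(1, 10), (1, 20), (2, 10)],
    )
    conn.commit()
    return conn


def b_ids_for(conn: sqlite3.Connection, a_id: int) -> list:
    """All "b" rows paired with one "a" row."""
    rows = conn.execute(
        "SELECT b_id FROM ab WHERE a_id = ? ORDER BY b_id", (a_id,)
    )
    return [row[0] for row in rows]


def a_ids_for(conn: sqlite3.Connection, b_id: int) -> list:
    """All "a" rows paired with one "b" row."""
    rows = conn.execute(
        "SELECT a_id FROM ab WHERE b_id = ? ORDER BY a_id", (b_id,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    connection = build_example()
    print(b_ids_for(connection, 1))
    print(a_ids_for(connection, 10))
```

One “a” row can pair with several “b” rows and vice versa, yet each foreign key on “ab” is an ordinary many-to-one reference.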
Further, it is at the physical abstraction level that implementation details such as datatypes (e.g. is it “VARCHAR(5)”, “VCHAR(5)”, “VARCHAR2(5)”, or “CHAR(5)”?) or creation syntax (e.g.
vendor specific SQL dialects) are shown.
The Coupling of Perspective and Abstraction Levels
Now that we have fleshed out the details, let us look again at the common confusion that surrounds our industry when using the terms “analysis”, “design”, “implementation”, “conceptual”,
“logical”, and “physical”.
Perhaps the root of this confusion is that people tend to like to see a perspective of a system in a particular abstraction level. For example, the author prefers to see the analysis perspective in
terms of the conceptual abstraction level of ORM. Others prefer to see the design perspective in terms of the logical abstraction level of ER or UML’s class diagrams. Finally, models from the
implementation perspective are almost always shown at the physical abstraction level.
However, in the manner that I have defined abstraction levels and perspectives, it is theoretically possible to couple them (in other words, match the perspectives with the abstraction levels used
to illustrate those perspectives) as follows:
Perspective | Abstraction Level |
Analysis | Conceptual, Logical, & Physical |
Design | Conceptual, Logical, & Physical |
Implementation | Conceptual, Logical, & Physical |
Table 1: Possible combinations of perspectives and abstraction levels
However, the typical industry use (judging unscientifically from the trends that the author sees) of the perspectives and the abstraction levels used to express them is as follows:
Perspective | Abstraction Level |
Analysis | Logical |
Design | Logical |
Implementation | Physical |
Table 2: Typical combinations of perspectives and abstraction levels
However, because of the expressibility of models in the conceptual abstraction, design perspectives expressed in a conceptual syntax are quite valuable. Further, the formal nature of typical
conceptual level languages such as ORM and ConQuer offers the chance for Computer-Aided Software Engineering (CASE) tools and code generators to automatically forward engineer a good deal of the
eventual implementation. It is also worthwhile to note that many people who use a logical syntax to express the analysis perspective will call those artifacts “conceptual” – thus causing confusion.
Thus, the author tends to use the following combinations of perspectives and abstraction levels:
Perspective | Abstraction Level |
Analysis | Conceptual |
Design | Conceptual & Logical |
Implementation | Physical |
Table 3: Ideal combinations of perspectives and abstraction levels
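The three tables can be restated as sets of (perspective, abstraction level) pairs, which makes the orthogonality of the two categorizations explicit. This is only an illustrative sketch; the constant names and the `levels_for` helper are hypothetical.

```python
from itertools import product

PERSPECTIVES = ("Analysis", "Design", "Implementation")
ABSTRACTION_LEVELS = ("Conceptual", "Logical", "Physical")

# Table 1: every pairing is theoretically possible.
POSSIBLE = set(product(PERSPECTIVES, ABSTRACTION_LEVELS))

# Table 2: the typical industry use, as the author sees it.
TYPICAL = {
    ("Analysis", "Logical"),
    ("Design", "Logical"),
    ("Implementation", "Physical"),
}

# Table 3: the combinations the author tends to use.
PREFERRED = {
    ("Analysis", "Conceptual"),
    ("Design", "Conceptual"),
    ("Design", "Logical"),
    ("Implementation", "Physical"),
}


def levels_for(coupling, perspective):
    """Abstraction levels used to express one perspective in a coupling."""
    return [
        level for level in ABSTRACTION_LEVELS
        if (perspective, level) in coupling
    ]


if __name__ == "__main__":
    for perspective in PERSPECTIVES:
        print(perspective,
              "| typical:", levels_for(TYPICAL, perspective),
              "| preferred:", levels_for(PREFERRED, perspective))
```

Both the typical and the preferred couplings are subsets of the nine possible pairings; they differ only in which abstraction levels are chosen for the analysis and design perspectives.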
The coupling between perspectives and abstraction levels can also yield efficient and accurate overall project processes. For example, in an OO process the artifacts at each “phase” (roughly,
what the author has termed “perspective”) feed the subsequent phases. In this manner, analysis use cases serve as an input to the design class models that can then be forward engineered into code
(physical implementation). Using the coupling shown in Table 3 one finds that the conceptual abstraction level (e.g. ORM) from the analysis perspective can be used to automatically generate a
logical abstraction level that can easily serve as the basis for the design perspective class diagram artifacts. However, this topic is a subject for a later series of articles and presentations,
so it will not be elaborated further here.
Conclusion
This paper has attempted to clear up the confusion resulting from the typical use of the terms “analysis”, “design”, “implementation”, “conceptual”, “logical”, and “physical”. It defined
them (and perhaps re-defined them) in an orthogonal manner in order to yield the ideal way to look at modeling and implementing an information system.
Maybe the usage of these terms has gotten so confusing that we need to define some new terms.
It is the author’s opinion that the terms “analysis”, “design”, and “implementation” (or “construction”) are clear and orthogonal. However, perhaps the terms “conceptual”, “logical”,
and “physical” are less orthogonal and need new words to define those abstraction levels.
Since I have defined the conceptual abstraction level to be at a higher level of detail, the terms — if replaced — should be replaced by better nomenclature that indicates this level of detail.
However, as we saw in the section on coupling perspectives with abstraction levels, it is not always clear as to what the difference between the physical abstraction level “detail” and the
perspective implementation “detail” is. It may even be argued that there is little to no difference between them.
In this manner, perhaps the physical abstraction level should be elided altogether in favor of a proper distinction (defined above) between the conceptual abstraction and the logical abstraction
level as well as a clear separation (also defined above) between the analysis, design, and implementation perspectives. Or maybe the physical abstraction level is simply useful for graphically
expressing the implementation. In this manner, it would seem that the graphical notation used could simply be the same as the logical notation(s) used, only adorned for additional physical
considerations (e.g. explicitly displaying the vendor-specific datatypes).
This seems to be the route taken by most of the CASE tool vendors. And perhaps that is the reason for the confusion.
Opinions Wanted
If you would like to offer an opinion on this topic, or perhaps even come up with better nomenclature, feel free to post your opinion on the JCM Discussion List. Subscription to this list is easy:
simply send an e-mail to jcm-subscribe@egroups.com and reply to the confirmation message that you will receive. From there, all you have to do is send
your comments to jcm@egroups.com. Or if you prefer, you may send any comments to the author privately; his contact information is in the bio below.
Further Reading
More information on Object-Role Modeling may be found in [12] and [15]. For more information on ConQuer, see [10], [11], and [13]. For a look at the business rules approach, see [7]. For a good comparison of ORM to UML’s class diagrams that further illustrates the division between the conceptual and logical abstraction levels, and thus (in this author’s opinion) why UML’s class diagrams do not reside at the conceptual level as defined above, see [16]; for a similar look at ER vs. ORM, see [14]. For another look at perspectives, see [18]. Further information on how OO processes define perspectives as “phases” may be found in [17].
Many articles exist on the usefulness of specifying design models at the conceptual abstraction level. Among these are [1], [2], and [4]. For a look at more benefits of using conceptual modeling techniques for domain enforcement, see [3]. For a look at using conceptual modeling techniques instead of normalization, see [5] and [6]. For a look at how conceptual level models easily feed the design and implementation perspectives, see [8] and [9].
References
[1] Barden, Dick and dela Cruz, Necito, Data Warehouse Construction: Transforming STARNET into a Star Schema via Object Role Modeling (available at www.inconcept.com)
[2] Barden, Dick and Hallock, Patrick, Detecting the Need for Paired Subsets, the Journal of Conceptual Modeling, Issue 11 (available at www.inconcept.com/JCM)
[3] Becker, Scot A., An Argument for the Use of ER Modeling, the Journal of Conceptual Modeling, Issue 10 (available at www.inconcept.com/JCM)
[4] Becker, Scot A., Case Study: Delaying the “Attribute or Entity” Decision, the Journal of Conceptual Modeling, Issue 13 (available at www.inconcept.com/JCM)
[5] Becker, Scot A., Data Schema Normalization, the Journal of Conceptual Modeling, Issue 9 (available at www.inconcept.com/JCM)
[6] Becker, Scot A., Normalization and ORM, the Journal of Conceptual Modeling, Issue 4 (available at www.inconcept.com/JCM)
[7] Date, C.J., What Not How: The Business Rules Approach to Application Development, Addison-Wesley, 2000
[8] dela Cruz, Necito, Success Story: Much Ado about ORM Modeling, the Journal of Conceptual Modeling, Issue 12 (available at www.inconcept.com/JCM)
[9] Hallock, Patrick, Composite Objects in Relational and Object Relation Constructs Using InfoModeler 3.1 Parts 1-2, the Journal of Conceptual Modeling, Issues 1-2 (available at www.inconcept.com/JCM)
[10] Halpin, Terry, Conceptual Queries, Database Newsletter (vol. 26, no. 2) (available at www.orm.net)
[11] Halpin, Terry, Conceptual Queries using ConQuer-II, Proc. ER’97: 16th International Conference on Conceptual Modeling, Springer LNCS, no. 1331, pp. 113-26. (available at www.orm.net)
[12] Halpin, Terry, Conceptual Schema and Relational Database Design, Second Edition (revised), WytLytPub, 1999
[13] Halpin, Terry, ConQuer: a Conceptual Query Language, Proc. ER’96: 15th International Conference on Conceptual Modeling, Springer LNCS, no. 1157, pp. 121-33. (available at www.orm.net)
[14] Halpin, Terry, Entity Relationship Modeling from an ORM perspective Parts 1-5, the Journal of Conceptual Modeling, Issues 11-15 (available at www.inconcept.com/JCM)
[15] Halpin, Terry, Object Role Modeling: An Overview (available at www.orm.net)
[16] Halpin, Terry, UML Data Models from an ORM Perspective Parts 1-10, the Journal of Conceptual Modeling, Issues 1-10 (available at www.inconcept.com/JCM)
[17] Jacobson, Ivar et al., The Unified Software Development Process, Addison-Wesley, 1999
[18] Leska, Paul, Jr., Avoiding Problem Solution Confusion, the Journal of Conceptual Modeling, Issue 14 (available at www.inconcept.com/JCM)
[19] Rosenberg, Doug and Scott, Kendall, Use Case Driven Object Modeling with UML, Addison-Wesley, 1999
© Copyright, 2000, InConcept. All Rights Reserved.