Modeling Business Rules – What Data Models Cannot Do

Published in TDAN.com April 2004

As we saw in the last issue of TDAN.com, Modeling Business Rules: What Data Models Do, a data model can show two and a half of the four kinds of
business rules: terms, facts, and the result of derivations. In this issue, we’ll find that it can show only a limited number of the fourth kind: constraints. In the next issue we’ll
show some extended notations and the metamodel for what cannot otherwise be shown. A fourth article will show how, while the rules cannot be shown in a model, they can be defined in a way that
allows their content to be captured in data that can be shown.

To highlight which parts of a business rule a data model can and cannot represent, a metamodel of business rules is presented here. This model shows the nature and structure of the rules we are trying to
capture, so that we can see which aspects are and are not captured by any particular CASE Tool or method. The model shown here is derived from two previous TDAN.com articles, A Repository Model – Business Rules (Structural Assertions and Derivations) and A Repository Model – Business Rules (Action Assertions). It illustrates the interactions between the rules to be captured and the data modeling technique.


Constraint

A constraint is a condition that determines what values an attribute or relationship can or must have. Constraints are supported on an entity/relationship diagram only to
a limited degree.

  • Cardinality can be shown.
  • Unique identifiers can be shown.
  • “Arcs” (mutually exclusive relationships) can be shown in the SSADM/Barker notation, and inter-role constraints can be shown in UML and ORM.
  • Optionality may only take a fixed value.
  • Domains are not shown unless as “type” entity classes.

The example model in Figure 1 shows some of these constraints:

  • Order is identified by the attribute “Order number” (shown by the octothorpe {#}).
  • Line item is identified by the attribute “Line number”, plus the relationship (role) part of order (shown by the tick mark across that end of the line).
  • Each line item must be part of an order (solid line), and it cannot be related to more than one order (shown by the lack of a crow’s foot).
  • Line item “Quantity” is required to take a value for each occurrence (shown by the asterisk).
  • Product type “Unit price” may or may not take a value (shown by a circle).
  • Each line item must be either for one product type or for one service (shown by the arc across the two relationship lines).







Figure 1: Example of Constraints on the Model
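The constraints listed for Figure 1 can also be expressed as validation logic. The following is a minimal Python sketch, not taken from the original article: the class names, field names, and error messages are assumptions made for illustration. It covers the required "Quantity", the identification of a line item by "Line number" within its order, and the arc between product type and service.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LineItem:
    order_number: int                   # mandatory role: part of exactly one order
    line_number: int                    # with the order, forms the unique identifier
    quantity: int                       # required attribute (the asterisk)
    product_type: Optional[str] = None  # one branch of the arc
    service: Optional[str] = None       # the other branch of the arc

    def __post_init__(self) -> None:
        if self.quantity is None:
            raise ValueError("Quantity is required")
        # The arc: exactly one of the two relationships must be present.
        if (self.product_type is None) == (self.service is None):
            raise ValueError("line item must be for one product type or one service")


class Order:
    """Hypothetical holder that enforces line-number uniqueness within one order."""

    def __init__(self, order_number: int) -> None:
        self.order_number = order_number
        self._line_items: Dict[int, LineItem] = {}

    def add_line_item(self, line_number: int, quantity: int,
                      product_type: Optional[str] = None,
                      service: Optional[str] = None) -> LineItem:
        if line_number in self._line_items:
            raise ValueError("duplicate Line number within this order")
        item = LineItem(self.order_number, line_number, quantity,
                        product_type, service)
        self._line_items[line_number] = item
        return item
```

Uniqueness of "Order number" across orders is not enforced in this sketch; that would need a registry of all orders.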

The notation shown in these articles is the one used in the European methodology, SSADM, and promoted by the Oracle Corporation. It is used here because it is the most versatile of the
entity/relationship notations, while remaining relatively easy to read. In the Information Engineering notation, many of these constraints can also be shown, but the symbology is different. Indeed,
originally, Information Engineering did not show attributes and did not have symbols for unique identifiers. CASE Tool vendors, however, have added these, each in its own way. None of them includes
arcs or any other inter-role constraints.

UML does have a more general concept of inter-role constraints than is shown here. It does not, however, have any concern for unique identifiers, although many practitioners add these through
“stereotypes”.

ORM has a much more sophisticated way of dealing with inter-role constraints, but this will be discussed in more detail in the next article.


Unique Identifiers – Metamodel

Figure 2 shows the metamodel for unique identifiers. It asserts that each entity class may be identifying its occurrences via one or more unique identifiers (one of which will have its
“Primary indicator” set to “True”). Each unique identifier, then, must be composed of one or more unique identifier elements, each of which is the use of either an attribute or a role.

This model itself shows examples of unique identifiers, in the form of business term being identified by “ID”, and unique identifier element being identified by a combination of
“Sequence number” and the relationship that “Each unique identifier element must be part of one unique identifier.” That is, a unique identifier occurrence
partially identifies each occurrence of unique identifier element.

Figure 2: Metamodel of Unique Identifier Constraint
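The structure described for Figure 2 can be sketched as a small set of Python classes. This is an illustrative reading of the text, not the author's implementation: the class and field names are assumptions, and the example at the end uses the line item identifier from Figure 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class UniqueIdentifierElement:
    sequence_number: int
    kind: str   # "attribute" or "role"
    name: str

    def __post_init__(self) -> None:
        if self.kind not in ("attribute", "role"):
            raise ValueError("an element is the use of an attribute or a role")


@dataclass
class UniqueIdentifier:
    elements: List[UniqueIdentifierElement]
    primary_indicator: bool = False

    def __post_init__(self) -> None:
        if not self.elements:
            raise ValueError("must be composed of one or more elements")
        numbers = [e.sequence_number for e in self.elements]
        if len(numbers) != len(set(numbers)):
            raise ValueError("sequence numbers must be unique within an identifier")


@dataclass
class EntityClass:
    name: str
    unique_identifiers: List[UniqueIdentifier] = field(default_factory=list)

    def primary_identifier(self) -> Optional[UniqueIdentifier]:
        primaries = [u for u in self.unique_identifiers if u.primary_indicator]
        if self.unique_identifiers and len(primaries) != 1:
            raise ValueError("exactly one identifier must be primary")
        return primaries[0] if primaries else None


# Example: line item is identified by "Line number" plus the role "part of order".
line_item = EntityClass("line item", [
    UniqueIdentifier(
        [UniqueIdentifierElement(1, "attribute", "Line number"),
         UniqueIdentifierElement(2, "role", "part of order")],
        primary_indicator=True,
    )
])
```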


Cardinality – Metamodel

The cardinality of a relationship or attribute is the maximum number of occurrences that are possible for a given entity class.

In the case of attributes, the maximum cardinality must be 1 for a relational database, but sometimes it is necessary to model un-normalized data, so the metamodel (Figure 2) must provide a
“Cardinality indicator” for attribute. (The default value should certainly be “False”.) The indicator would be “True” if more than one occurrence of an attribute were possible. This cannot be
shown in an entity / relationship model, but it can be shown in UML.

Maximum cardinality is more meaningful for roles. For a given occurrence of an entity class, may it be related to more than one occurrence of another entity class? This is shown as the role
attribute “Cardinality indicator”. If it may, the “Cardinality indicator” has the value “True”. If the role can only be connected to a single occurrence of an entity
class, then “Cardinality indicator” has the value “False”. This is shown in the entity relationship models presented here by the presence of a “crow’s foot” if more than
one occurrence is possible, and its absence if the role is constrained to only one occurrence of the target entity class. Other notations use different symbols, but the meaning is the same.

Figure 3: Metamodel of Cardinality Constraint
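As a rough illustration of how the “Cardinality indicator” might be used, the Python sketch below checks a set of related occurrences against a role. The names Role and violates_cardinality are hypothetical and do not come from the metamodel itself.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Role:
    name: str
    # True: more than one related occurrence is possible (crow's foot).
    # False: at most one related occurrence is allowed.
    cardinality_indicator: bool = False


def violates_cardinality(role: Role, related: Sequence[object]) -> bool:
    """Return True if the related occurrences exceed the role's maximum cardinality."""
    return not role.cardinality_indicator and len(related) > 1


# Roles from Figure 1: a line item is part of one order; an order has many line items.
part_of = Role("part of order", cardinality_indicator=False)
composed_of = Role("composed of line items", cardinality_indicator=True)
```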


Inter-role Constraints

The arc in Figure 1 is an example of an inter-role constraint. In the notation being used here, this is the only available option. It means “exclusive or”; that is, in the example, each line
item must be either for one and only one product type or for one and only one service. UML has a more general concept of inter-role constraints, with a dashed line
across the affected relationships. The meaning of the line depends on the situation, however, and that meaning is determined by a label attached to the constraint.

The metamodel for this constraint is shown in Figure 4. Here, the constraint can be to constrain either an entity class or a relationship. In the case of the arc, above, the
constraint is to constrain the entity class line item and it is composed of two constraint elements:

  • Each line item must be for one and only one service.
  • Each line item must be for one and only one product type.

Figure 4: Metamodel of Inter-role Constraint
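One way to picture a labeled inter-role constraint is as a record holding the constrained entity class, a label, and its constraint elements, with the label deciding how the elements are evaluated. The Python sketch below assumes a small, hypothetical label vocabulary ("xor" and "or"); neither the vocabulary nor the class name is taken from the article or from any tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional, Tuple

# Hypothetical label vocabulary: each label judges how many roles are populated.
LABEL_RULES: Dict[str, Callable[[int], bool]] = {
    "xor": lambda populated: populated == 1,  # the arc: exactly one role
    "or": lambda populated: populated >= 1,   # at least one role
}


@dataclass(frozen=True)
class InterRoleConstraint:
    constrained_entity_class: str
    label: str
    role_names: Tuple[str, ...]  # one entry per constraint element

    def is_satisfied_by(self, occurrence: Mapping[str, Optional[str]]) -> bool:
        # Count the constrained roles that actually point at something.
        populated = sum(
            1 for role in self.role_names if occurrence.get(role) is not None
        )
        return LABEL_RULES[self.label](populated)


# The arc from Figure 1, expressed as data.
arc = InterRoleConstraint(
    constrained_entity_class="line item",
    label="xor",
    role_names=("for product type", "for service"),
)
```

Keeping the constraint as data rather than as hard-coded logic mirrors the metamodel: a new constraint is a new occurrence, not a new program.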


Optionality

Looking at an entity / relationship diagram (or a UML diagram), “optionality” seems straightforward. An attribute is required or not. A relationship is required or not. But this is misleading. In
fact, very often a value for either is initially optional, but eventually must be filled in. It is common to not require entry of a value on a screen, but rather to notify the
operator that one must be added before some other event can happen. Clive Finkelstein has upgraded the Information Engineering notation to account for a relationship that “initially may be, but
eventually must be”, but even this seems inadequate to describe the full meaning of optionality.

Figure 5 shows an example of this. This model is about physical assets, and the accounting for them in asset accounts. Governments and regulatory agencies frown upon defining a financial asset
account when there is no physical asset that corresponds to it. So, each asset account must be an accounting of one or more physical assets. On the other hand, it is typical for an
organization to acquire a physical asset well before it is recorded in the general ledger. So, each physical asset may be accounted for in one and only one asset account.

What the data model cannot show are the circumstances under which the relationship becomes mandatory.

Figure 5: More That Data Models Can’t Do

The metamodel of this problem is shown in Figure 6. This diagram recognizes that the optionality of an attribute or a role must be based on the entity class state—is the entity class
just being created? Is the attribute or role going to be referred to by something else? And so forth.

Each entity class state type is a definition of a kind of entity class state, such as “Initial entry”, “Attribute Y in entity class X has taken the value of Z”, etc.

In our example, the question is, what is the state of the asset that makes it required to be accounted for in an asset account? The entity class state type could be as simple as “Passage
of time in weeks”, with the “Value” of the entity class state being “3”. Or it could be something such as “asset received in warehouse”.

This kind of subtlety cannot be represented on a two-dimensional entity / relationship diagram. It must be documented elsewhere.

Figure 6: Metamodel of Optionality
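To make the idea of state-dependent optionality concrete, here is a minimal Python sketch of the asset example. It assumes the “Passage of time in weeks” state type with a value of 3, as suggested in the text; the class, the function names, and the choice of acquisition date as the starting point are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumed threshold, taken from the "3 weeks" illustration in the text.
GRACE_PERIOD = timedelta(weeks=3)


@dataclass
class PhysicalAsset:
    name: str
    acquired_on: date
    asset_account: Optional[str] = None  # optional at first, required later


def account_is_required(asset: PhysicalAsset, today: date) -> bool:
    """The relationship becomes mandatory once the grace period has passed."""
    return today - asset.acquired_on >= GRACE_PERIOD


def violates_optionality(asset: PhysicalAsset, today: date) -> bool:
    """True if the asset should be accounted for by now but is not."""
    return account_is_required(asset, today) and asset.asset_account is None
```

A different entity class state type, such as “asset received in warehouse”, would replace the date comparison with a check on a status field.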


Other Constraints

The limited number of constraints described above do not begin to cover the extent to which constraints can be applied to an organization’s data—but cannot be shown in an entity /
relationship diagram. All these other constraints are going to have to be documented in a different way than using a data model.*

This problem is accentuated by the tendency to generalize data models, thereby removing some constraints that can be made explicit. The rationale behind generalizing is that it makes the models
more robust—less vulnerable to changes in the business. Indeed the structure of an organization’s data can be made to reflect fundamental, unchanging characteristics of the
business. It is the constraints on that structure that are more vulnerable to change, so perhaps they should be managed separately from data structure.

To see an example of this, look at the left side of Figure 7. Here you see a typical representation of an organization structure. Each group must be part of a department; each department
must be part of a division, etc. The problem with this is that the company is prone to changes in the nature of its organization structure. If a consultant comes in and recommends adding
section between department and group, suddenly your database requires major surgery. A more robust solution is shown on the right, where an organization is simply part of another
organization. This means that organizations can be added and the structure may be changed at any time, without ever having to affect the database structure.

It also means, however, that you cannot prevent a particular division’s being part of a group, or a company’s being part of a department. The constraints in the first model that
forced groups to be part of a department, etc. do not exist in the second model.

Figure 7: Generalizing Models

Figure 8 appears to offer a solution to this, adding the concept of organization type, and the notion that each organization type may be part of one and only one other organization type.
But this doesn’t really constrain the organization relationship. Just because an organization must be an example of an organization type, there is nothing to constrain the part of
relationship in organization to be consistent with the part of relationship in organization type.

In addition, the organization entity class shown on the left of the figure has nothing on the drawing that keeps the part of / composed of relationship from looping (organization A is
part of organization B, and organization B is part of organization A).

Figure 8: What Data Models Cannot Do
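Since neither rule can be drawn, both have to be checked somewhere else. The Python sketch below shows one possible check for a single organization: that its parent is of the type its own organization type may be part of, and that following part of upward never returns to an organization already visited. The type hierarchy in ALLOWED_PARENT_TYPE is an assumption based on the group, department, division, and company examples in the text, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Organization:
    name: str
    org_type: str
    part_of: Optional[str] = None  # name of the parent organization, if any


# Assumed organization type hierarchy: type -> the one type it may be part of.
ALLOWED_PARENT_TYPE: Dict[str, Optional[str]] = {
    "company": None,
    "division": "company",
    "department": "division",
    "group": "department",
}


def check_part_of(orgs: Dict[str, Organization], name: str) -> List[str]:
    """Return the problems found for one organization's part-of relationship."""
    problems: List[str] = []
    org = orgs[name]

    # Rule 1: the parent's type must match what the organization type allows.
    if org.part_of is not None:
        parent = orgs[org.part_of]
        if ALLOWED_PARENT_TYPE.get(org.org_type) != parent.org_type:
            problems.append(
                f"{org.org_type} {org.name} may not be part of "
                f"{parent.org_type} {parent.name}"
            )

    # Rule 2: walking up the hierarchy must never revisit an organization.
    seen = {name}
    current = org.part_of
    while current is not None:
        if current in seen:
            problems.append(f"part-of loop involving {current}")
            break
        seen.add(current)
        current = orgs[current].part_of

    return problems
```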

Figure 9 shows another kind of model that stymies our attempts to represent constraints. This is the basic laboratory model, where each sample is drawn according to one and only one sample
method. Each sample is then subject to one or more tests to determine its characteristics. Each test, in turn, is an example of a test type, that defines the characteristics of the test.
Separately, each sample method is the basis for one or more test requirements, each being the requirement for a particular test type. That is, material collected via a particular
sample method (such as “fill a beaker”), can only be tested via a specified set of test types (such as “pH”, “viscosity”, etc.).

There is nothing on the diagram, however, that says a test cannot be conducted that is an example of any kind of test type that exists.

Note, however, that the language of the rule is fully present in the diagram even if the actual constraint itself is not. This is significant. The rule is data-driven. This will be
discussed in the fourth article.

Figure 9: A Data Driven Rule
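A data-driven rule of this kind can be sketched in a few lines of Python. The test requirement occurrences are held as data, and the check simply looks them up. The function name and the representation of a test requirement as a pair are assumptions for illustration; the sample method and test type values are the ones used as examples in the text.

```python
from typing import Set, Tuple

# Each test requirement pairs a sample method with a permitted test type.
TestRequirement = Tuple[str, str]


def is_test_permitted(requirements: Set[TestRequirement],
                      sample_method: str, test_type: str) -> bool:
    """A test is permitted only if a matching test requirement exists."""
    return (sample_method, test_type) in requirements


# The rule's content lives in the data, not in the code.
requirements: Set[TestRequirement] = {
    ("fill a beaker", "pH"),
    ("fill a beaker", "viscosity"),
}
```

Adding or removing a test requirement changes what is permitted without touching the check itself, which is the sense in which the rule is data-driven.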

Next issue, we will discuss some notations that do represent at least some business rules, and will complete our metamodel of the structure of a business rule.

*Some additional constraints can be represented in Object Role Modeling, which is a different approach to data modeling. Some of these will be described in
the next article in this series. Even in that case, however, there are constraints that cannot be represented.

About David Hay

In the Information Industry since it was called “data processing”, Dave Hay has been producing data models to support strategic and requirements planning for thirty years. As President of Essential Strategies International for nearly twenty-five of those years, Dave has worked in a variety of industries and government agencies. These include banking, clinical pharmaceutical research, intelligence, highways, and all aspects of oil production and processing. Projects entailed defining corporate information architecture, identifying requirements, and planning strategies for the implementation of new systems. Dave’s recently-published book, “Enterprise Model Patterns: Describing the World”, is an “upper ontology” consisting of a comprehensive model of any enterprise, from several levels of abstraction. It is the successor to his ground-breaking 1995 book, “Data Model Patterns: Conventions of Thought”, the original book describing standard data model configurations for standard business situations. In addition, he has written other books on metadata, requirements analysis, and UML. He has spoken at numerous international and local data architecture, semantics, user group, and other conferences.
