INTRODUCTION
Current data modeling tools support the concept of a Project Data Model as a subset of the Enterprise Data Model. The material below argues that this is a simplistic understanding of the problem
and that the Project Model should be defined as a constrained subset. Some compromises usually made during project modeling are also discussed.
SELECTION AND PROJECTION SUBSET
A distributed database system requires fragmentation of the global relations (tables) among the local databases [Bell and Grimson]. For example, the global relation can define an enterprise
view, and the local database can present data for a specific organizational unit, business area, or employee. The fragmentation includes selections, which create subsets of entity instances (rows), and
projections, which create subsets of attributes (columns). A global relation is reconstructed from its fragments through unions (for selections) and joins (for projections).
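
The fragmentation and reconstruction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from [Bell and Grimson]: the EMPLOYEE relation, its attributes, and the helper names select, project, union, and join are all hypothetical.

```python
# Hypothetical global relation EMPLOYEE: one dict per entity instance (row).
EMPLOYEE = [
    {"emp_id": 1, "name": "Ann", "dept": "SALES", "salary": 50},
    {"emp_id": 2, "name": "Bob", "dept": "SALES", "salary": 60},
    {"emp_id": 3, "name": "Cid", "dept": "AUDIT", "salary": 70},
]

def select(rows, predicate):
    """Selection: keep the rows (entity instances) that satisfy the predicate."""
    return [dict(r) for r in rows if predicate(r)]

def project(rows, attrs):
    """Projection: keep only the named attributes (columns) of every row."""
    return [{a: r[a] for a in attrs} for r in rows]

def union(*fragments):
    """Reassemble selection fragments, dropping duplicate rows."""
    seen, out = set(), []
    for fragment in fragments:
        for r in fragment:
            key = tuple(sorted(r.items()))
            if key not in seen:
                seen.add(key)
                out.append(r)
    return out

def join(left, right, key):
    """Reassemble projection fragments by joining on the shared key."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

def canon(rows):
    """Sort rows by emp_id so that relations can be compared as sets."""
    return sorted(rows, key=lambda r: r["emp_id"])

# Selection fragments: one local database per department.
sales = select(EMPLOYEE, lambda r: r["dept"] == "SALES")
audit = select(EMPLOYEE, lambda r: r["dept"] == "AUDIT")

# Projection fragments: both keep the key so they can be joined back.
public = project(EMPLOYEE, ["emp_id", "name", "dept"])
payroll = project(EMPLOYEE, ["emp_id", "salary"])

rebuilt_by_union = canon(union(sales, audit))
rebuilt_by_join = canon(join(public, payroll, "emp_id"))
```

In this sketch both rebuilt_by_union and rebuilt_by_join compare equal to the original relation, which is the sense in which the fragments lose no information.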
The concepts of Local Scheme and Project Model are similar. A Project Model defines data for a given business area, or even for a given application within it. A business area is interested in a
limited number of objects (entities and attributes) out of the enterprise-wide set. In this regard the Project Model can be thought of as a subset of the Enterprise Data Model. The filtered view
that includes a subset of entity types is supported by the current data modeling tools. A subset of entity instances is supported by the subtype capabilities of the tools. Some of the tools even
let you choose individual attributes to be included/excluded in a project view.
ADDITIONAL CONSTRAINTS
A given business area may deal with only a subset of instances of an enterprise-wide entity. Let’s assume that those instances are covered by subtype(s) of the entity type. By definition, the
subtype inherits relationships from the supertype. However, the relationship for the subtype might be more restrictive than the more generic supertype relationship, or an additional
relationship/constraint may exist that does not hold enterprise-wide. The same fact can be stated in the opposite direction: if more than one business area works with the same entity and they have
different requirements on relationships, the Enterprise Data Model should accommodate the most flexible relationship.
The Project Model should reflect the more constrained relationship. Doing so frees application code from the need to enforce the business rule itself. The precise modeling of business rules simplifies the
application design.
Example: An employee can be hired and not be assigned to a department, so the relationship from employee to department is optional at the enterprise level. For any department-level application, only
employees assigned to the department are of interest, so there is a mandatory relationship between a subset of employees and a department. This mandatory relationship constraint can simplify the
application design. Why not use it in the department-level databases? Why not capture it in the Project Model?
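
The employee example can be sketched with Python's built-in sqlite3 module. The table and column names (enterprise_employee, project_employee, dept_id) and the try_insert helper are assumptions made for illustration; the two employee tables stand in for the enterprise-level and department-level definitions of the same entity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY);

    -- Enterprise model: the relationship is optional, so dept_id may be NULL.
    CREATE TABLE enterprise_employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES department(dept_id)
    );

    -- Project model: the same relationship, constrained to mandatory.
    CREATE TABLE project_employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );

    INSERT INTO department VALUES (10);
""")

def try_insert(table, emp_id, dept_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            f"INSERT INTO {table} (emp_id, dept_id) VALUES (?, ?)",
            (emp_id, dept_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

# An unassigned employee is valid enterprise-wide ...
enterprise_accepts_unassigned = try_insert("enterprise_employee", 1, None)
# ... but the department-level database rejects it without application code.
project_accepts_unassigned = try_insert("project_employee", 1, None)
project_accepts_assigned = try_insert("project_employee", 2, 10)
```

With the constraint declared in the project-level schema, the application never needs to test for an employee without a department; the database refuses such a row.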
Another example: Inspectors have overlapping territories, and the many-to-many relationship is resolved by an Inspector ZIP associative entity. A laptop-based tool is developed for the inspectors.
The universe of the tool is limited to a single inspector. Here the one-to-many relationship from ZIP to Inspector ZIP is in effect restricted to one-to-one. The enterprise view supports both this
tool and a territory distribution application, so it still needs the more flexible one-to-many relationship.
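
The inspector example can be checked with a small Python sketch. The sample rows and the names INSPECTOR_ZIP, max_rows_per_zip, and laptop_fragment are hypothetical, and the sketch assumes that each (inspector, ZIP) pair occurs at most once in the associative entity.

```python
from collections import Counter

# Hypothetical enterprise-level Inspector ZIP rows: (inspector_id, zip_code).
INSPECTOR_ZIP = [
    ("insp-1", "10001"),
    ("insp-1", "10002"),
    ("insp-2", "10002"),  # the two territories overlap on ZIP 10002
    ("insp-2", "10003"),
]

def max_rows_per_zip(rows):
    """Largest number of Inspector ZIP rows that share a single ZIP."""
    return max(Counter(zip_code for _, zip_code in rows).values())

def laptop_fragment(rows, inspector_id):
    """Selection for the laptop tool: the rows of a single inspector."""
    return [r for r in rows if r[0] == inspector_id]

# Enterprise view: one ZIP can have many Inspector ZIP rows.
enterprise_max = max_rows_per_zip(INSPECTOR_ZIP)

# Laptop view: within one inspector, each ZIP has exactly one row.
fragment = laptop_fragment(INSPECTOR_ZIP, "insp-1")
laptop_max = max_rows_per_zip(fragment)

# The tighter cardinality lets the tool key its data by ZIP alone.
by_zip = {zip_code: inspector for inspector, zip_code in fragment}
```

In the enterprise data the same ZIP appears under two inspectors, while in the single-inspector fragment every ZIP appears once, so a lookup keyed on ZIP alone is unambiguous there.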
Intuitively, the Project Model should include less metadata than the Enterprise Model. The addition of constraints extends Project metadata beyond the Enterprise Model, which contradicts this intuitive
notion. The contradiction is superficial. What matters is not the object count but the direction, from a flexible model to a more restrictive one. The subset extraction and constraints
complement each other here.
According to [Atzeni and De Antonellis], selections and projections are considered to be restrictions. They prevent applications from changing the excluded data. Constraints, such as
changing a relationship from optional to mandatory, obviously restrict the data as well.
PROJECT COMPROMISES
Deployment of the Enterprise Data Model requires its tailoring to the needs of each individual project. Problems with using the Enterprise Data Model are most apparent with small applications. For
example, let’s assume management reports need 20 attributes. Playing by the book, the Data Administrator pulls entities with the attributes, then associative entities. Then supertypes of the
previous entities are included, and supertypes of the supertypes. Then more entities are added to preserve access paths for foreign keys. The total comes to a whopping 40 entities. No cries of
overall goodness can save the model from developers’ scorn. If we can at least leave supertype entities out of the picture and short-circuit access paths, then the model has a chance to be
implemented.
ACCESS PATHS
If a given entity belongs to the business area, then the entities tightly coupled to it are likely to be included too. Sometimes within a grandparent-parent-child chain the business area may not be
interested in the attributes of the entity in the middle of the chain. Current tools require showing all the entities in order to propagate the keys. On the other hand, removing unused entities
can reduce the complexity of a project model. One possible compromise is to collapse unused entities into a relationship. The role of the new relationship is to preserve the data access path. It
is a derived relationship based on the removed entities and relationships. Because of its derived nature, it cannot be updated directly. Only a change to the underlying objects can trigger a change to
the derived relationship. Any application that does not have access to the base objects should treat this data as read-only.
Example: The usage of computer resources is captured at the employee level and reported at the accounting company level. An enterprise-wide organizational hierarchy includes employee, unit,
department, division, and accounting company. The intermediate organizational levels are not in the project requirements. The project is not responsible for any organizational changes. It is
reasonable to include only the employee and the company instead of the full hierarchy. The excluded hierarchy implies a many-to-one relationship between the remaining employee and company entities. By
adding the employee-company relationship to the diagram, we allow tools, such as query builders, to take advantage of it.
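
One way to picture such a derived relationship is as a database view over the full hierarchy. The sketch below uses Python's built-in sqlite3 module; the table names, key columns, and the employee_company view are assumptions for illustration, and a view is only one possible realization of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Enterprise hierarchy (hypothetical tables, key columns only).
    CREATE TABLE company    (company_id    INTEGER PRIMARY KEY);
    CREATE TABLE division   (division_id   INTEGER PRIMARY KEY,
                             company_id    INTEGER NOT NULL REFERENCES company(company_id));
    CREATE TABLE department (department_id INTEGER PRIMARY KEY,
                             division_id   INTEGER NOT NULL REFERENCES division(division_id));
    CREATE TABLE unit       (unit_id       INTEGER PRIMARY KEY,
                             department_id INTEGER NOT NULL REFERENCES department(department_id));
    CREATE TABLE employee   (emp_id        INTEGER PRIMARY KEY,
                             unit_id       INTEGER NOT NULL REFERENCES unit(unit_id));

    -- Derived employee-company relationship: the intermediate levels are
    -- collapsed into a single access path.
    CREATE VIEW employee_company AS
        SELECT e.emp_id, v.company_id
        FROM employee e
        JOIN unit u       ON u.unit_id = e.unit_id
        JOIN department d ON d.department_id = u.department_id
        JOIN division v   ON v.division_id = d.division_id;

    INSERT INTO company VALUES (1), (2);
    INSERT INTO division VALUES (11, 1), (12, 2);
    INSERT INTO department VALUES (21, 11), (22, 12);
    INSERT INTO unit VALUES (31, 21);
    INSERT INTO employee VALUES (100, 31);
""")

def company_of(emp_id):
    """Follow the derived relationship from an employee to its company."""
    row = conn.execute(
        "SELECT company_id FROM employee_company WHERE emp_id = ?", (emp_id,)
    ).fetchone()
    return row[0]

before = company_of(100)

# The derived relationship cannot be updated directly.
try:
    conn.execute("UPDATE employee_company SET company_id = 2 WHERE emp_id = 100")
    view_updatable = True
except sqlite3.OperationalError:
    view_updatable = False

# Only a change to the underlying objects changes the derived relationship.
conn.execute("UPDATE unit SET department_id = 22 WHERE unit_id = 31")
after = company_of(100)
```

The failed UPDATE reflects the read-only nature of the derived relationship in this sketch: SQLite views are not directly updatable unless INSTEAD OF triggers are defined. A project database that holds only employee and company would receive the relationship as data and treat it the same way.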
TOOLS PROBLEMS
All the transformations must be documented as project specific, so they don’t sneak into the Enterprise Model and corrupt its integrity. The introduction of additional constraints and derived
relationships poses a challenge to the data modeling tools, because the project model now has additional objects compared to the enterprise model. Current data modeling tools don’t support the
constraint transformations, so the automated mapping between a full model and its constrained subset is not feasible.
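
As a rough illustration of what documenting transformations as project specific could look like, the Python sketch below tags every relationship with its origin. It does not describe any existing tool: the Relationship and Model classes, the origin field, the function names, and the sample entities (including vendor) are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Relationship:
    name: str
    parent: str
    child: str
    mandatory: bool
    origin: str = "enterprise"  # "project" marks a project-specific change

@dataclass
class Model:
    entities: dict  # entity name -> set of attribute names
    relationships: list = field(default_factory=list)

def derive_project_model(enterprise, keep, overrides):
    """Extract a subset of entities and attributes, then apply overrides.

    keep maps each retained entity to the attributes the project needs;
    overrides lists relationships the project tightens or adds.
    """
    entities = {
        name: set(attrs) & enterprise.entities[name]
        for name, attrs in keep.items()
    }
    overridden = {o.name for o in overrides}
    inherited = [
        r for r in enterprise.relationships
        if r.parent in entities and r.child in entities
        and r.name not in overridden
    ]
    return Model(entities, inherited + list(overrides))

def project_specific(model):
    """Relationships that must not leak back into the Enterprise Model."""
    return [r for r in model.relationships if r.origin == "project"]

enterprise = Model(
    entities={
        "department": {"dept_id", "name", "budget"},
        "employee": {"emp_id", "name", "dept_id", "salary"},
        "vendor": {"vendor_id", "name"},
    },
    relationships=[
        Relationship("employs", "department", "employee", mandatory=False),
        Relationship("supplies", "vendor", "department", mandatory=False),
    ],
)

project = derive_project_model(
    enterprise,
    keep={
        "department": {"dept_id", "name"},
        "employee": {"emp_id", "name", "dept_id"},
    },
    overrides=[
        Relationship("employs", "department", "employee",
                     mandatory=True, origin="project"),
    ],
)
```

Here project_specific(project) returns only the tightened employs relationship, while the enterprise model object is left unchanged, which is the separation the paragraph above asks for.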
SYNOPSIS
The mapping from Enterprise Model to Project Model includes extracting subsets of entity types and attributes, as well as adding constraints. This allows Project Models to follow business
rules closely without compromising the integrity of the Enterprise Data Model.
Bibliography
1. P. Atzeni, V. De Antonellis, Relational Database Theory, Benjamin/Cummings, 1993.
2. D. Bell, J. Grimson, Distributed Database Systems, Addison-Wesley, 1992.