Introduction
There are four data models that form the overall data model topology within an enterprise. These are depicted in Figure 1. A description of each model and their respective meta-model is included in this feature:
-
Data Element Model
-
Concepts Model
-
Logical Database Model
-
Physical Database Model
-
The Data Element Model (and Meta-Model)
The Data Element Model represents all the metadata necessary for the specifications of business facts across the enterprise regardless of their containers. That is, concept, database, screen, reports, and the like. The meta-entities necessary for the data element model are depicted in Figure 2. Here, a Data Element is a context-independent business-fact semantic template. Its function is to be the overarching container of semantics for:
-
Business facts that can be deployed across collections of entities from one or more entities within different subjects.
-
Business facts that can be deployed across collections of columns from one or more tables within different logical database schemas.
Data Elements are not simple. Rather, they are the result of a cascading set of data-related understandings and refinements. First there are data-related concepts that, when bound into Information Technology environments, become data. Here a concept that ultimately would results into specifications for databases might be Persons, Real Property, Abstract Property, Geography, Transportation, Finance, Manufacturing, and the like
While Concepts are the abstract textual descriptions that ultimately form the basis for data specifications, they are not the descriptions of the data types that can be employed to build data elements. Needed are the Conceptual Value domains, and examples include Dimensions, Weights, Geographic Coordinates, Strings, and Money.
Data Elements are not simple. Rather, they are the result of a cascading set of data-related understandings and refinements. First there are data-related concepts that, when bound into Information Technology environments, become data. Here a concept that ultimately would results into specifications for databases might be Persons, Real Property, Abstract Property, Geography, Transportation, Finance, Manufacturing, and the like
While Concepts are the abstract textual descriptions that ultimately form the basis for data specifications, they are not the descriptions of the data types that can be employed to build data elements. Needed are the Conceptual Value domains, and examples include Dimensions, Weights, Geographic Coordinates, Strings, and Money.
When Concepts and Conceptual Value domains are joined they act as hosts for a more refined set of conceptual value-domain based semantics, called Data Element Concepts. Examples of Data Element Concepts include, for a person: Biographic Information, Skill Information, Person Past Employment Information, and the like. For Finance, it might include Ledger Specification Information and Finance Transaction Information. For Geography, it might include Geographic Locations. For each of these Data Elements there is a well-defined concept, and assigned value domains.
Conceptual Value Domains are further refined into well-known Value Domains. Included would be: Money, Gender, Metric Dimension units, Weight Units, Geographic Point Specifications, and the like.
Subordinate to Value domains are explicitly allowed values. For example, a range of dates for allowed genders, Job Classifications, Employment Separation Codes, and the like. These Value Domain Value sets are thus bound into specific data elements, attributes, or columns.
The final joined-component pair for Data Elements are Data Element Concepts and Value domains. An enumeration of data elements results. Included might be: Person First, Street Name, House Number, Financial Instrument Identifier, Financial Instrument Amount, Real Product Name, and the like.
Once the Data Elements are determined, they are ready to be employed as semantic templates for use in the Concepts Data Model and the Logical and Physical Database Models.
The Concepts Data Model (and Meta-Model)
A Concepts Data Model is a data model of concepts that is independent of use within any databases. Concepts Data Models of concepts are defined through entities, attributes, and relationships within specific subjects. There is a many-to-many relationship between the Concepts Data Model and the Logical Database Model. The meta-model for the Concepts Data Model is presented in Figure 3.
In Concepts Data Models, entities can have subtypes to any depth. An example might be for education which may have a common set of attributes for allclasses of education, and subtypes for each different education class such as high school, college, or trade school.
Attributes represent the deployment of data element semantics within entities. Because of this one-to-many relationship between data element and attribute, users can see which attributes employ a data element’s semantics regardless of the entity or subject.
Attributes can also be assigned various meaning modifier semantics and data use semantics. This enables automatic name construction for attributes. Automatic definitions and abbreviations are also enabled because there is a persistent relationship between the assigned semantic and the attribute. This too enables finding and reporting attributes regardless of entity and subject on the basis of either a semantic or data use modifier.
There can also be relationships among the entities within one subject, and there can be relationships between entities across subjects. If this were just a data model in a conceptual form, there could not be relationships across different subject-based data models. Each conceptual data model would be a stove-pipe that could only be transformed into a logical stove pipe data model and thereafter into a physical stove pipe data model.
Again, Concepts Data Models are not conceptual forms of a data model. Rather, they are well-developed data models in their own right. Each entity within a Concepts Data Model should be in third normal form. Concepts Data Models are persistent. Their form remains as defined. Its structures—that is, the data models of the concepts—are available as templates for the construction of one or more Logical Database Models.
The Logical Database Model (and Meta-Model)
Logical Database Models are models of databases independent of any particular DBMS such as Oracle or DB2.
Logical Database Models, illustrated in Figure 4, contain schema, table, column, SQL data types, primary key, primary key column, foreign key, foreign key column, candidate key, and candidate key column.
The Logical Database Models inherit their data structures from one or more Concepts Data Models, and in turn, provide these database data structures for use in the construction of DBMS specific database data models, that is, Physical Database Models.
Logical Database Models are defined as tables, columns and relationships within schemas. Tables cannot be related across schemas. All the tables in this data model generalization level should be in at least third normal form to ensure data model clarity and quality design. There is a many-to-many relationship between a Logical Database Model and a Physical Database Model. This is because a given Physical Database Model might be a data warehouse in which some of its DBMS tables and columns are inherited from multiple Logical Database Models.
There can be relationships among the tables within one schema as well as data types, assigned value domains, and other appropriate metadata for complete data models. There can also be assigned semantic modifiers, data use modifiers, and value domains assigned in the Logical Database Model.
Thus, as with the Concepts Data Model, there can be automatic name, definition, and abbreviation construction. Because the Logical Database Model comes from a metadata management system database, any semantic assignment conflicts between the Concept and Logical Database Models are immediately discovered and prevented.
If the semantics already assigned to attributes of entities within the Concepts Data Model are sufficient, these are automatically inherited by columns of tables in the Logical Database Model.
Each table may be constructed through the deployment of a collection of attributes from a single Concepts Data Model entity, or collections of attributes from multiple Concepts Data Model entities within one or more subjects.
An entity or sub-collection of attributes from one entity can be used to form different data structures in multiple tables. The relationship between the Concepts Data Model and the Logical Database Model is thus, many-to-many, as it exists in real-world data modeling environments.
Tables in the Logical Database Model can be subtyped. Columns can also be complex to support nested structures of arbitrary depth. The data types assigned to a Logical Database Model are those allowed by the ANSI SQL standard.
Finally, because the Logical Database Model is built from a foundation of the Concepts Data Models, a complete cross reference between the two classes of models is easily generated.
The Physical Database Model (and Meta-Model)
Physical Database Models are models of databases that conform to the requirements of a particular DBMS such as Oracle or DB2. Additionally, a Physical database model’s design must conform to the expected processing requirements of the particular set of database applications supported by the particular database. Operational data models are somewhat equivalent to the term Physical Database. The meta-model for the Operational Data Model is set out in Figure 5.
Physical Database Model designs vary because of many factors such as transaction volume, the host operating system, and the computer hardware size and throughput capabilities. Regardless, DBMS tables and/or DBMS columns of a Physical Database Model database should all relate back to tables and/or columns of a Logical Database Model.
Physical Database Models are defined as DBMS tables, DBMS columns and relationships within DBMS schemas. “DBMS” is employed here to signify that these data model components are tied to a specific DBMS.
The Physical Database Model is similar to the Logical Database Model in its characteristics. The triple for the Operational Data Model is DBMS Schema, DBMS table, and DBMS column.
A key difference in the Physical Database Model from the Logical Database Model is that it represents the data structures that result in SQL DDL streams that feed the DBMS to actually make the databases.
There may be multiple Physical Database Models for a given Logical Database Model. Each Physical Database Model may contain a different subset of a Logical Database Model’s tables and columns, or in some cases, all its tables and columns. It may also be the case that a complete set of a table’s columns are transformed into differently named columns. For example, there might have been a Logical Database Model table for telephone numbers that, in the Physical Database Model, are transformed into TelephoneNumber-1 through TelephoneNumber-5. This is a kind of Physical Database Model denormalization may be created to improve performance.
There may also have been multiple Logical Database Models for a given Physical Database Model. This commonly occurs in data warehouse data model designs where the Physical Database Model for the data warehouse is sourced from multiple Logical Database Models.
The Physical Database Model can contain subtyped DBMS tables, and substructures in DBMS Columns. It can also have an assigned value domain and the data types reflect that of the containing DBMS.