Database Administrators (DBAs) do not need meta data. You read that statement right. They don’t need meta data. Here – I have been an advocate of meta data management for close to eight years and
I come out with a statement like that. Unbelievable … or is it? Database Administrators don’t need meta data because they already have what they need. Well … let me just say that
they have enough to do their job and do it well. In comparison to data modelers, application developers, and business / data analysts, DBAs are way ahead of the curve.
The database catalog, the sidekick of the DBA, is a meta data database. The data in the catalog defines how the data is structured in the database. Desktop and mainframe tools used by the DBAs read
and write meta data directly from and to the database catalog. Data definition language (DDL), an executable script of meta data, is created and executed by the DBA to change the database catalog.
The catalog is THE major resource of information (read meta data) about databases.
The database catalog is a perfect example of, to steal a phrase from data modeling / meta data lecturer and author – William G. Smith, “in-line” meta data. By definition, changing in-line meta
data directly impacts the definition, design, structure, movement, accessibility, … of the data and the processes that use the data. Changing a table index in the database catalog changes
the actual index of the table. Changing the database catalog can be dangerous. This is one reason why, in most cases, only DBAs have the privilege of making changes to the catalog.
DBAs use “in-line” meta data. So what? Why should the DBAs be concerned with the meta data health of the organization when they have what they need? How important is the DBA’s meta data to the
rest of the organization? How important is the Oracle DBA’s meta data to the DB2 or the SQL Server DBA? What role does the DBA play in the overall context of meta data to the organization? The
purpose of this article is to demonstrate where DBA meta data fits into the enterprise meta data picture and how the DBAs can become a partner with positive influence on meta data and information
resource management success.
DBA as the Enterprise Meta data Hub
Lets start with the first question first. Why should the DBA be interested in meta data? Using basic proof theory: 1) DBAs understand the importance of the data in the database catalog, 2) the data
in the database catalog is meta data, therefore, 3) the DBAs understand the importance of meta data. Too easy? Perhaps not. If the DBA can imagine life for one day without access to the catalog,
they may understand how others in the organization feel when they don’t have information about the data that they need to perform their job function. So how can the DBA help?
The meta data in the database catalog is central to the Core Meta Model displayed in Figure 1. The databases, tables / views of data, indexes, and the columns (represented as meta data types on the
diagram) are vital links (or the Hub) in the definition of meta data to support the enterprise. Without DBA meta data, there is no connection between the business definition of the data (in the
data model) and the data. Without DBA meta data, there is no map / relationship between source and target data. Without DBA meta data, there is no connection between the SQL and the data. Without
DBA meta data, the enterprise structure of meta data (at the technical level) does not exist.
Figure 1: Core Meta Model
mouse-over to enlarge
It is almost too simple to think that such a small number of meta data types from the database catalog (the databases, tables / views, and columns) can have such an impact on the ability to manage
enterprise meta data. But it is true. The rest of the article will analyze the relationships between the DBA and the data modeler, the DBA and the application developer, and the DBA and the
business / data analyst. This analysis will focus on the meta data as the driver of these relationships.
The DBA and the Data Modeler
The interaction between the DBA and data modeler typically involves the passing of information from one individual to the other. This hand-off of information always includes data about data,
whether it is logical information (model entities, attributes, and keys) coming from the data modeler or physical database information (databases, tables, columns, and indexes) coming from the DBA.
The information is meta data. The strength of the relationship between the modeler and the DBA often is based on the quality of the meta data that is passed from one individual to the other.
Changes Made by the Modelers First
In an IT environment where changes in data and data requirements feed through the data modelers to the DBAs, the data model is changed prior to changes to the physical database design. In this
environment, the information that is passed from the data modeler to the DBA typically takes on one of three forms:
- a logical data model that has not yet been forward engineered to a new physical model; in this situation, the information needed by the DBA to update the physical data design is stored in the
logical data model - a draft physical model that is the result of forward engineering logical data to physical data in a data modeling tool; in this case the information needed by the DBA is stored in the physical
data model - a completed physical data model that has been transformed by the data modeler into data definition language (DDL); this information can be executed by the DBA to physically change the database
design.
This interaction between the modeler and the DBA takes place during the design and development phase of the data life cycle, or when changes to the data model have been requested by the application
developers or the business analysts responsible for defining data requirements.
When the logical data model is changed before the database, the physical data model and database catalog need to be changed to reflect the changes. Ideally, steps need to be taken to bring the
physical database definition up to date with the changes to logical model and database design. In some shops processes are clearly defined for where the modeler’s responsibility ends and the
DBA’s responsibility takes over. This process, although though not stated as such, is the passing of meta data from the modeler to the DBA.
Changes Made by the DBAs First
In an IT environment where changes in database structure are first given to the DBAs (exactly the opposite of the above scenario), there needs to be a process for the DBAs to communicate the
changes to the data modelers (preferable before the DBAs make their changes). In this environment, the information that is passed from the DBA to the data modeler typically takes on one of two
forms:
a memo or a scratch piece of paper indicating what changes need to be made
the physical structure of the database through the catalog or data definition language
Reverse-engineering (as it relates to data modeling) is a term used to describe the movement of meta data from the physical database / catalog towards the (re-) generation of the physical and
logical data models. Tools are available that allow DBAs to hand their changes back to the data modelers, however, the process of reverse-engineering database meta data to become data model meta
data can become a manual and time intensive process. Companies use this type of functionality to model existing databases that did not originate with a model. Companies seldom perform this type of
task on an on-going basis to keep data models and databases in synch.
The DBA and the Application Developer
Application Developers are very creative people (and this comes from a former COBOL and MUMPs developer). Harkening back to my developer days, when a glitch became apparent in a program, screen,
batch script, … the developer’s first reaction was to solve the problem by fixing the program so it operates “properly” or “stops blowing up”. The changes could be as simple as adding
another “if-then” statement that allowed for a new value to pass from one variable to another. Little thought was given to the validity of the new value or the impact of the additional value on
the rest of the application or business. This was done over and over again, and process is still intact with many “patchwork” legacy applications that exist today. In those days there was seldom
a methodology for making a change.
If we look closely at the application development methodologies (ADM) in companies today, we find first that many companies don’t have a singular ADM for the entire company or even a fraction of
the company. One of the key strengths of the ADM is that it nary leaves a stone unturned. In a well-followed ADM it is documented where one individual’s responsibility ends and the next person’s
responsibility starts. When information is passed from one individual to another, this information is meta data. Without an ADM, there is not a defined procedure for passing meta data between
developers and DBAs.
To (over-) simplify matters, lets look at a very basic example:
When the Application Development Manager tells the DBA to add an “exception indicator” to the “customer transaction” table, the problems start. If the DBA adds the column without delay the
Manager is happy and the DBA did what he / she was told. If the DBA tells the Manager to see the Data Administration group so that the field can be defined, dissected, and turned inside-out, the
change may be a week or weeks or it may not take place at all. Is anybody happy? Is this a good thing or a bad thing? The answer is … it depends (on the validity of the requirement, the
definition, the business rule, …).
Without a “structured” method to communicate changes between the DBAs and the application developers, the top-down design can quickly become an “upside-down and side-to-side” application.
Improved information and information flow about the application and the data can improve the understanding and the quality of both.
When the application developers and DBAs are on the same page, the DBAs are not viewed as “quick change artists”. The DBAs are viewed as a strategic link in the chain that consists of business
interests and application development. If DBAs and developers exchange and document quality information (in the form of meta data), the potential for success becomes much greater.
The DBA and the Business / Data Analyst
Back to the example of the “exception code” change to the “transaction table”:
Lets say that the “exception code” was absolutely necessary, the DBA told the modelers to change the data model, the modelers forward-engineered the logical model to a physical model, DDL is
generated from the physical model, and the DBA executes the DDL to add the column to the “customer transaction” table. The meta data exchange process between the developer, the DBA, and the
modeler was followed to precise detail.
BUT … What about the business and data analysts that use the data? Do they know what the “exception” is or when the “exception” became active? Do they know anything other than the fact
that a new column called “exception code” (or EXCPT_CD) appears on the table that they use? Do they understand the impact that the column may have on the report that is sent to the COO every
quarter, and the one sent to the CFO every week? Do they know where to get this information? Or lets say that the business or data analyst doesn’t even know that the column was added?
Business and data analysts are always looking for information about the data. The more they know about the data, the better off they are. This information ranges from data definition and valid
values, to business rules relating subjects and entities of data, to databases, tables, views, indexes, columns, … much of this information about the data comes from the database catalog.
The DBAs, by default, create the meta data source that supplies much of this information to the business and data analysts.
DBAs also provide the business and data analysts with other information not mentioned here. DBAs provide support to the data user in the form connectivity, authorization, performance, and on and
on.
However, most often, the meta data in the DBMS catalog that will be important to individuals other than the DBAs will include information about the database, tables, views, columns, and indexes.
There is a tremendous amount of meta data in the DBMS catalog, but when individuals are looking to use the data, the physical characteristics listed [here] cover their most basic needs.
(1)
Typically, it is not the DBA’s responsibility to inform the business and data analysts of the information that they need to do their jobs. The DBAs are responsible for all of the physical aspects
of the databases. However, it has to be someone’s responsibility to prevent occurrences like the one caused by the “exception code” field. If the DBA is the front-line, they can prevent
occurrences like this from happening.
DBAs Do Need Meta data
This article has given examples of how DBAs play a key role in meta data management, with or without them being aware of it. BUT … Do DBAs really have all the meta data that they need in
their DBMS catalogs? The real answer to that question is a resounding “NO!”
Mike Yocca, of Ace Database (Pittsburgh – www.acedb.com) stated it well when he labeled his search for information about the databases to be “the churn”. Mike
stated that “in the day of increased complexity of databases and database servers, my customers spend a tremendous amount of time researching and identifying IP addresses, server info, contact
info, business rules, user IDs and passwords, and much more. DBAs certainly DO NOT have all of the meta data that they need.”
Conclusion
Don’t go running to the DBAs in hopes that they will lead the charge in support of your meta data project. DBAs understand the importance of meta data because they use it all of the time. They
have a database full of meta data that they use on a regular basis (which is more than can be said for data modelers, application developers, or business / data analysts) but they might not have
thought of themselves as being a key player in meta data success for the corporation. Hopefully, this article will help them to think otherwise.
(1) Meta Data Themes: The Basics – Robert S. Seiner – The Data Administration Newsletter, June 1998