Imagine what it would be like if there were an inexpensive one-size-fits-all solution to the management of meta data. Imagine what it would be like if it were easy to move meta data from one tool or
platform to another and if tool vendors collaborated on standard cross-product meta models. Imagine meta data as an integral part of the business intelligence offering in your company. Imagine if
pigs could fly.
I can picture a lot of data architects and repository administrators nodding their heads in unison. These individuals are most likely to know what a “dream” it would be to have such an
integrated IT environment. The integrated environment would make it much easier to improve data quality services, improve data standard compliance, reduce data redundancy, and offer a higher level
of understanding about corporate data. A dream; Webster’s dictionary calls it a “wild fancy or hope”. The scenario described in the first paragraph is a dream. It is not reality. Not yet. The
time may be coming but it is not around the corner.
What can we do to make the best of the current situation? Should we wait around for the few vendors with resources to provide solutions to battle out the issues and pick a single standard? This may
take more time than we have (or it may never happen). We have to act now.
The purpose of this two-part article is to demonstrate, in easy-to-understand terms, a manageable way of selecting the meta data to manage that will prove its value and build a foundation for
future data and meta data management efforts. The goal of the first article will be to discuss basic themes of meta data. The second article will discuss advanced themes of meta data.
Getting the Ball Rolling
For companies that I have worked with, getting started is one of the hardest parts of the meta data project. Companies that require return on investment in hard dollars for meta data management
often have a difficult time cost-justifying the need for an enterprise meta data framework or strategy approach. It is possible to apply dollars to the ROI (through reduced
unproductive work, rework, research time, incorrect decisions, and more); however, these dollars are often viewed as inflated and as resting on assumptions that the companies will change the way
they manage their data. Companies that require hard ROI are fully justified in wanting to know what they are getting for their money. My point is that sometimes it becomes a hurdle that can become
insurmountable.
Many companies that move forward with meta data management projects understand the need to improve data resource knowledge and do not require a defined hard-dollar return on investment. These
companies believe that leveraging other investments (data warehousing, package implementations, system development, e-commerce, …) is only possible through improving
the understanding and management of data through meta data.
Neither of these organizations is wrong. In the first type of organization, getting the ball rolling takes more time and, in these days of reduced budgets, becomes more and more difficult. In these
companies, the data architects and data resource management staff hope that management will allow them to improve the organization’s understanding of its data. These same individuals hope that
their company will not wake up years behind its competition.
For the companies that require ROI definition, all is not lost. There are things that can be done in small, tightly focused projects that do not require a substantial amount of resources.
Enterprise meta data does not need to be delivered all at once. In fact, it is difficult (if not impossible) to deliver enterprise meta data all at once. Therefore, the first step of the project
should be to understand the big picture of enterprise meta data management as a backdrop for smaller, tightly focused meta data projects. Get the ball rolling, show value, and move on from
there.
Themes in Major Projects
Which of the following projects are significant projects at your company?
- Reducing data redundancy?
- Delivering just-in-time decision support data?
- Supporting knowledge workers in their use of business intelligence data?
- Solving the Year 2000 problem?
- Delivering an integrated data solution by using an enterprise data model?
- Implementing a third-party package (SAP, PeopleSoft, Oracle Fins, …)?
You say … all of these are important. Certainly these projects are important to a large number of companies that apply a large portion of their IT budget to these types of efforts.
It is unreasonable to believe that companies will fund a project or apply resources to a project to address the meta data concerns of all of the projects listed above right out of the gate. For all
of the projects above, meta data can be a key component to success and sustainability.
So … How do we select which meta data is the right meta data to manage as it pertains to the most important projects in our company? By looking for themes of meta data. When analyzing the
basic meta data that is important to each of these projects, we uncover three underlying meta data themes.
Meta Data Themes – Basic
Much of the meta data for the types of projects defined above is based on similar themes. These themes are:
- The logical description of data through data modeling.
- The physical structure of the data, whether or not it is held in a DBMS catalog.
- The dynamics of the data: mapping, movement, and transformation.
Purists may tell you that there is a lot more to meta data than just these three themes. This is true. However, for the purposes of this first article, I will discuss the extremely basic themes of
meta data and I will focus on the items listed above and defined below. The basic meta data themes offer you an easy-to-understand starting point for meta data management.
The table below demonstrates the types of meta data that are related to the basic themes:
Categories of Meta Data (Themes) | Samples of Meta Data Types |
Data Model Meta Data | Data Models, Entities, Attributes, Domains, Value Tables, Allowable Values, Keys, Partnerships, … |
Physical Database Meta Data | Databases, Tables, Copybooks, Columns, Elements, Indexes, … |
Data Movement Meta Data | Source Element in Context, Target Element in Context, Transformation Logic, Transformation Type, Mapping Type, … |
Figure 1 – Basic Meta Data Themes
Data Model Meta Data
Many projects start with the business definition of data through the use of a data modeling tool or tools. There are a variety of tools available for project data modeling and enterprise data
modeling. By analyzing the meta models of several modeling tools in established repository environments, common objects appear that define the basis of data modeling meta data.
Common to most modeling products and projects are the definitions of business entities, business attributes, domains, allowable values, and the key attributes and business rules that are used to
relate entities to each other.
There are other types of meta data that are important to the modelers and the modeling tool. However, when looking at how the data modeling meta data would likely be used away from the tool, the
items listed in the table above will most likely answer the needs of the business and technical meta data users.
When business data models are forward engineered to physical database designs, additional meta data is captured in the modeling tool (mapping of the logical model to the physical database). This
information will be used to relate the business definition of the data to the physical environment.
A basic diagram of the modeling meta data discussed above is shown in Figure 2.
Figure 2. – Basic Data Model Meta Data
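To make the data model theme a little more concrete, here is a minimal sketch of how the common modeling objects (entities, attributes, domains, allowable values, keys, and the business rules that relate entities) might be held outside the modeling tool. The class names, field names, and the Customer sample are all hypothetical; they are not taken from any particular tool's meta model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Domain:
    """A named set of allowable values that attributes can share."""
    name: str
    allowable_values: List[str] = field(default_factory=list)


@dataclass
class Attribute:
    """A business attribute with its business definition."""
    name: str
    definition: str
    domain: Optional[Domain] = None
    is_key: bool = False


@dataclass
class Entity:
    """A business entity and the attributes that make it up."""
    name: str
    definition: str
    attributes: List[Attribute] = field(default_factory=list)

    def key_attributes(self) -> List[str]:
        """Names of the attributes that identify the entity."""
        return [a.name for a in self.attributes if a.is_key]


@dataclass
class Relationship:
    """A business rule that relates two entities."""
    parent: str
    child: str
    rule: str


# Hypothetical sample content, for illustration only.
status = Domain("Customer Status", ["ACTIVE", "INACTIVE"])
customer = Entity(
    "Customer",
    "A person or organization that buys from the company.",
    [
        Attribute("Customer Number", "Unique identifier of a customer.", is_key=True),
        Attribute("Customer Status", "Whether the customer is active.", status),
    ],
)
places = Relationship("Customer", "Order", "A customer places zero or more orders.")
```

Even a structure this small is enough to answer questions such as which attributes make up an entity and which values an attribute is allowed to take.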
Physical Database Meta Data
If you are using a relational database management system such as DB2, Oracle, Sybase, SQL Server or Informix, the DBMS catalog serves as a meta data repository for your database environment. All
of the information found within the catalog is meta data. Not all of this information is useful to individuals who are not database administrators. DBAs often have direct access to the physical
database meta data through database management tools. In that case, DBAs may tell you that they do not require integrated meta data to help them with their jobs. They already have access to the
meta data that they need to perform their jobs.
Most likely, the meta data in the DBMS catalog that will be important to individuals other than the DBAs will include information about the database, tables, views, columns, and indexes. There is a
tremendous amount of meta data in the DBMS catalog, but when individuals are looking to use the data, the physical characteristics listed in this paragraph cover their most basic needs.
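As a rough illustration of pulling that basic subset out of a catalog, the sketch below reads table, column, and index meta data from SQLite, using Python's built-in sqlite3 module. SQLite is only a stand-in chosen so the example runs anywhere; DB2, Oracle, Sybase, SQL Server, and Informix each expose their own catalog tables or views, and the queries would differ. The customer table and its index are hypothetical.

```python
import sqlite3


def describe_catalog(conn):
    """Return {table: {"columns": [(name, type)], "indexes": [name]}}."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            (row[1], row[2])
            for row in conn.execute(f"PRAGMA table_info('{table}')")
        ]
        # PRAGMA index_list rows start with (seq, name, unique, ...)
        indexes = [
            row[1] for row in conn.execute(f"PRAGMA index_list('{table}')")
        ]
        catalog[table] = {"columns": columns, "indexes": indexes}
    return catalog


# Hypothetical database, built in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_status TEXT)"
)
conn.execute("CREATE INDEX ix_customer_status ON customer (cust_status)")
catalog = describe_catalog(conn)
```

The result is deliberately small: table names, column names and types, and index names are the parts of the catalog that people other than DBAs are most likely to ask about.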
Legacy applications often store data in files defined by copybook members (used to define the data in flat files, VSAM files, and IMS segments). Copybook members do not contain the same level of
detail about the physical data as the DBMS catalog. But since data is still stored that way (and will continue to be stored that way), it is important to consider these types of physical data
definition in the basic themes of meta data.
The types of meta data found in copybook members include copybook names, record names, field names, group names, and the structure of the data (i.e. picture clauses).
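The sketch below shows one way record names, group names, field names, and picture clauses might be pulled out of a copybook member. It is a simplified illustration, not a COBOL parser: it assumes one declaration per line and ignores clauses such as OCCURS, REDEFINES, VALUE, and usage (COMP) clauses. The copybook content and the function name parse_copybook are hypothetical.

```python
import re

# Level number, data name, optional PIC/PICTURE clause, closing period.
FIELD_PATTERN = re.compile(
    r"^\s*(?P<level>\d{2})\s+(?P<name>[A-Z0-9-]+)"
    r"(?:\s+PIC(?:TURE)?\s+(?:IS\s+)?(?P<picture>\S+?))?\s*\.\s*$",
    re.IGNORECASE,
)

# Hypothetical copybook member, for illustration only.
COPYBOOK = """
       01  CUSTOMER-RECORD.
           05  CUST-NO        PIC 9(7).
           05  CUST-NAME.
               10  FIRST-NAME PIC X(20).
               10  LAST-NAME  PIC X(30).
           05  CUST-BALANCE   PIC S9(5)V99.
"""


def parse_copybook(text):
    """Extract level, name, picture, and kind for each simple declaration."""
    fields = []
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if not match:
            continue  # blank lines, comments, and unsupported clauses
        level = int(match.group("level"))
        picture = match.group("picture")
        if level == 1:
            kind = "record"
        elif picture is None:
            kind = "group"   # no picture clause: a group of subordinate items
        else:
            kind = "field"
        fields.append(
            {
                "level": level,
                "name": match.group("name").upper(),
                "picture": picture,
                "kind": kind,
            }
        )
    return fields


fields = parse_copybook(COPYBOOK)
```

Each resulting entry carries the same kind of information a catalog would hold about a column, which makes it possible to store copybook and DBMS meta data side by side.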
A basic diagram of the physical database meta data is shown in Figure 3.
Figure 3. – Physical Database Meta Data
Data Movement Meta Data
Data does not stand still. Data moves from business unit to unit, function to function, and database to database. Data is, perhaps, the most dynamic asset of the company. As data is recorded and
processed, mappings and transformations take place that determine how data should be interpreted.
In many companies, during data movement processes developed for package implementation and/or decision support environments, mapping and transformation meta data is often captured in data
transformation tools that automate the data movement process. The information about how the data in the package or warehouse was created is important to the warehouse builders. But that
meta data can also be important to business users whose job it is to interpret the data.
Companies that don’t use data movement and transformation tools should pay special attention to documenting how data is selected, extracted, mapped, moved, and transformed. This information can
typically (but not easily) be found in the meta data that exists 1) in the source code that is written to perform these functions or 2) in external forms of documentation such as word processing
documents, spread sheets, and personal databases. Either way, the data movement meta data becomes important for the proper interpretation and use of the data.
A basic diagram of data movement meta data is shown in Figure 4.
Figure 4. – Data Movement Meta Data
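For companies that document data movement by hand, a mapping record does not need to be elaborate. The sketch below captures a source element in context, a target element in context, a transformation type, and the transformation logic, and then walks the mappings backwards to find where a target element came from. The Mapping class, the trace_to_sources function, and the element names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mapping:
    source_element: str        # source element in context, e.g. file.field
    target_element: str        # target element in context, e.g. table.column
    transformation_type: str
    transformation_logic: str


def trace_to_sources(target: str, mappings: List[Mapping]) -> List[Mapping]:
    """Walk the mappings backwards from a target element to its origins."""
    lineage = []
    seen = set()
    pending = [target]
    while pending:
        current = pending.pop()
        if current in seen:
            continue  # guard against circular mappings
        seen.add(current)
        for m in mappings:
            if m.target_element == current:
                lineage.append(m)
                pending.append(m.source_element)
    return lineage


# Hypothetical mappings: legacy file -> staging table -> warehouse table.
mappings = [
    Mapping("CUSTMAST.CUST-NO", "STAGE.CUSTOMER.CUST_NO",
            "copy", "Move as-is"),
    Mapping("STAGE.CUSTOMER.CUST_NO", "DW.CUSTOMER_DIM.CUSTOMER_KEY",
            "lookup", "Assign surrogate key by customer number"),
]
lineage = trace_to_sources("DW.CUSTOMER_DIM.CUSTOMER_KEY", mappings)
```

The same records serve both audiences: warehouse builders read them forward to see what each load step does, and business users read them backward to see where a number originated.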
Ways to Use Basic Theme Meta Data
To this point we have defined three basic themes for meta data management: Data Model Meta Data, Physical Database Meta Data, and Data Movement Meta Data. The meta data gained from these three
pieces of the data architecture can be used to answer a large number of questions about the IT environment of your organization.
Doing a good job of managing the basic meta data is not an easy task. There are many considerations to keep in mind when implementing meta data management for the three basic themes: identifying
the meta data, selecting the meta data, mapping the meta data to a common target, moving the meta data to a centralized repository (or separate data store), and keeping the meta data up-to-date.
However, the hard work can pay off if meta data is available to answer specific questions about the corporation’s data assets. A list of typical questions appears below:
- What business entities of data exist in the organization, warehouse, or mart?
- What physical tables represent each business entity?
- Where do the data and information about the business entities exist?
- What are the business attributes that make up each of the business entities?
- What are the business definitions of the business entities and attributes?
- Through what business rules are my business entities related?
- How does my business view of my data relate to the physical databases?
- What are the valid values (or ranges) permitted for each attribute of data?
- What physical elements represent each attribute of data?
- Where did the data originate for my data warehouse, 3rd party package?
- What mapping and transformation took place before the data arrived in the data warehouse, 3rd party application?
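Once meta data from the three themes sits in one place, questions like these reduce to simple lookups. The sketch below is one possible arrangement, assuming a small central repository held in SQLite through Python's sqlite3 module. The table names, column names, and sample rows are all hypothetical, and a real repository would carry far more detail.

```python
import sqlite3

repo = sqlite3.connect(":memory:")
repo.executescript("""
    -- Data model theme: business entities and their definitions.
    CREATE TABLE entity (entity_name TEXT PRIMARY KEY, definition TEXT);
    -- Link between the business view and the physical tables.
    CREATE TABLE entity_table (entity_name TEXT, table_name TEXT);
    -- Data movement theme: where each target column came from.
    CREATE TABLE element_mapping (
        source_element TEXT, target_table TEXT, target_column TEXT,
        transformation_logic TEXT
    );
    INSERT INTO entity VALUES
        ('Customer', 'A party that buys from the company.');
    INSERT INTO entity_table VALUES ('Customer', 'CUSTOMER_DIM');
    INSERT INTO element_mapping VALUES
        ('CUSTMAST.CUST-NO', 'CUSTOMER_DIM', 'CUSTOMER_NO', 'Copy as-is');
""")


def tables_for_entity(entity_name):
    """What physical tables represent this business entity?"""
    rows = repo.execute(
        "SELECT table_name FROM entity_table "
        "WHERE entity_name = ? ORDER BY table_name",
        (entity_name,),
    )
    return [r[0] for r in rows]


def origins_for_entity(entity_name):
    """Where did the data for this entity originate, and how was it changed?"""
    rows = repo.execute(
        """
        SELECT m.source_element,
               m.target_table || '.' || m.target_column,
               m.transformation_logic
        FROM entity_table AS et
        JOIN element_mapping AS m ON m.target_table = et.table_name
        WHERE et.entity_name = ?
        ORDER BY m.source_element
        """,
        (entity_name,),
    )
    return rows.fetchall()
```

Calling tables_for_entity("Customer") answers the second question in the list, and origins_for_entity("Customer") answers the last two, without the user having to know which tool originally held the meta data.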
Conclusion
This article addressed the first step of meta data management: the identification of basic meta data that can be helpful in many organizations. By managing the meta data in the data modeling tools,
physical database environment, and the mapping and transformation of data, organizations will be better prepared to deliver information about the projects and data that are the most important to
the organization.
The second article will discuss advanced meta data themes that include data access, data quality meta data, business rules meta data, and information accountability meta data. The advanced meta
data, in conjunction with the basic meta data, provides an easy-to-understand mechanism for moving forward with enterprise meta data management.