This article is adapted from the book “Universal Meta Data Models” by David Marco & Michael Jennings, John Wiley & Sons
This article is the third installment of an ongoing series of articles on the Data Stewardship Framework. In this installment we will examine the most common data stewardship activities. Visit
Part One & Part Two.
Data Stewardship Activities
The specific activities of the data stewardship committee will vary from one organization to another and from industry to industry. Having worked with several data stewardship committees, I can
attest that no two are exactly the same. However, there are some common activities that I see these groups working on. Below are the most common data stewardship activities:
- Data Domain Values
- Data Quality Rules, Validation and Resolution
- Business Rules/Security Requirements
- Business Meta Data Definitions
- Technical Data Definitions
The data stewards who work on these activities will primarily be working with meta data; however, there will be occasions when they need to work with actual data. In the
following sections I will walk through these common data stewardship activities and provide a set of guidelines and best practices for them. I will also discuss the data stewards who typically
accomplish these tasks. It is important to note that some people are highly knowledgeable about the data and the business policies around the data, even though they do not belong
to the particular stewardship group that I mention for the activity. For example, in your company there may be some technical stewards who are as knowledgeable about the business policies and data
values as any of the “official” business stewards. So even though I state in my guidelines that the business stewards should create the business meta data definitions, you would
want these technical stewards to work with the business stewards on those definitions.
In all of the activities that I will discuss, the chief steward plays a critical role in ensuring that the work is properly completed. A good chief steward will make sure that the
technical and business stewards work thoroughly and expeditiously. The chief steward understands how easy it is for these groups to fall into “analysis paralysis”. The chief steward will act as a
project manager and guide in each of these activities. Most importantly, the chief steward will help these people with any conflict resolution needed (and you will need it) for the different data
stewardship tasks.
Data Domain Values
Once the business stewards have defined their key data attributes, they need to define the domain values for those attributes. For example, if one of the attributes is state code, the valid domain
values would be the two-character abbreviations of the states (e.g. CA, FL, IL, NY, etc.).
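To make the state code example concrete, here is a minimal sketch of how domain values might be represented and checked. The names DOMAIN_VALUES, STATE_CODE and is_valid_domain_value are my own illustrations, not part of any particular repository product, and only the four states mentioned above are listed.

```python
# Hypothetical domain table: attribute name -> set of valid domain values.
# Only the four states named in the text are listed; a real table would
# hold the full set entered by the business stewards.
DOMAIN_VALUES = {
    "STATE_CODE": {"CA", "FL", "IL", "NY"},
}


def is_valid_domain_value(attribute: str, value: str) -> bool:
    """Return True if value belongs to the domain defined for attribute."""
    allowed = DOMAIN_VALUES.get(attribute)
    if allowed is None:
        # No steward has defined a domain for this attribute yet.
        raise KeyError(f"No domain values defined for {attribute!r}")
    return value in allowed
```

A load process or data entry screen could call is_valid_domain_value("STATE_CODE", "CA") before accepting a record.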
As with all of the data stewardship tasks, this meta data will be stored in the meta data repository. It is highly recommended that a web-based front end be developed so that the business stewards
can easily key in this vital meta data.
In many cases data modelers have already entered attribute domain values into their modeling tool (e.g. Erwin, Silverrun, System Architect). If this has occurred in your company, you can create a
process to export that meta data from the modeling tool into the meta data repository. This will give the business stewards a good starting point for entering domain values.
Data Quality Rules, Validation and Resolution
Data quality is the joint responsibility of both the business and technical stewards. It is the responsibility of the business steward to define data quality thresholds and data error criteria. For
example, the data quality threshold for customer records that fail during a data warehouse load process may be 2%. Therefore, if the percentage of customer records in error is greater than 2%, then
the data warehouse load run is automatically stopped. An additional rule can state that if the percentage of records in error is 1% or greater but less than 2%, then a warning message is sent
to the data warehouse staff; however, the data warehouse run is allowed to proceed. An example of data error criteria would be a rule defined for the HOME_LOAN_AMT field. This rule would state
that the allowable values for the HOME_LOAN_AMT field are any numeric value between $0 and $3,000,000.
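The threshold and error criteria described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the range check treats both $0 and $3,000,000 as allowed, and because the rules as stated do not say what happens at exactly 2%, the sketch treats that case as a warning.

```python
STOP_PCT = 2  # load stops when more than 2% of records are in error
WARN_PCT = 1  # warning is sent when 1% or more of records are in error


def load_action(records_in_error: int, total_records: int) -> str:
    """Return 'stop', 'warn' or 'proceed' for a data warehouse load run."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    if records_in_error * 100 > STOP_PCT * total_records:
        return "stop"
    # Assumption: exactly 2% falls here, since the rule does not cover it.
    if records_in_error * 100 >= WARN_PCT * total_records:
        return "warn"
    return "proceed"


def is_valid_home_loan_amt(value) -> bool:
    """Data error criterion: numeric and between 0 and 3,000,000 inclusive."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return 0 <= value <= 3_000_000
```

For a run of 1,000 customer records, 25 errors would stop the load, 15 would trigger a warning, and 5 would let the run proceed silently.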
It is the responsibility of the technical stewards to make sure that the data quality rules are implemented and adhered to. In addition, the technical stewards will work with the business
stewards on the specific data quality thresholds and data error criteria.
Business Rules/Security Requirements
Business rules are some of the most critical meta data within an organization. Business rules describe how the business operates with its data. A business rule describes how the data values
are derived or calculated, how the field relates (cardinality) to other fields, data usage rules and regulations, and any security requirements or limitations around a particular entity or attribute
within your company’s systems. For example, a healthcare insurance company may have a field called “POLICY_TREATMENTS”. This field may list the specific medical treatments that a policy
holder has undergone. The business rule for this field could be that it is an alphanumeric, 20-byte field whose “system of record” is “System A”. In addition, there may be security
requirements on this field. Most health insurance companies provide coverage to their own employees. Therefore, the security requirement for this field may be that the IT department is not allowed
to view this field or associate it with any fields that would identify the policy holder. When security rules like these are broken, the corporation faces a great deal of legal exposure.
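As a rough illustration, the POLICY_TREATMENTS rule could be captured as a structured record that downstream systems check before exposing the field. The BusinessRule class, its attribute names, and the group names "IT" and "Claims" are all hypothetical; this is a sketch of the idea, not the layout of any actual meta data repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessRule:
    """Hypothetical record holding the business rule for a single field."""
    field_name: str
    data_type: str
    length_bytes: int
    system_of_record: str
    # Groups that must never view the field (the security requirement).
    denied_groups: frozenset = frozenset()


# The example from the text: an alphanumeric, 20-byte field sourced from
# "System A" that the IT department is not allowed to view.
POLICY_TREATMENTS = BusinessRule(
    field_name="POLICY_TREATMENTS",
    data_type="alphanumeric",
    length_bytes=20,
    system_of_record="System A",
    denied_groups=frozenset({"IT"}),
)


def may_view(rule: BusinessRule, user_group: str) -> bool:
    """Return True if members of user_group may view the field."""
    return user_group not in rule.denied_groups
```

A reporting tool could call may_view(POLICY_TREATMENTS, "IT") and suppress the column when the result is False. The restriction on associating the field with identifying fields would need a separate check that is not shown here.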
In my next column I will finish walking through the specific activities of a data stewardship committee.
© 2005 EWSolutions All Rights Reserved