The Role of Data Dictionary in a Package Environment

Published in TDAN.com January 2006


BACKGROUND

IT leaders from enterprises around the world have realized that acquiring and implementing commercial off-the-shelf (COTS) application packages is a smarter thing to do, than the traditional
approach of developing home-grown applications. This may not be the fit-for-all solution, but it makes sense from a cost, time to market and resource standpoint. Selection and acquisition of the
software package is just the first step. How the software package brings value through successful implementation, is critical to the business. Internal groups as well as industry analysts will
closely watch COTS package implementation in the organization.

Implementation success depends heavily on how well the staff is educated with the design of the COTS product such as ERP, CRM or SFA packages. A good, accurate and detailed Data Dictionary goes a
long way in helping the implementation staff make proficient in use of the data design aspect of the software package. This article presents numerous ways this precious data can be put to work to
make the most of it for the organization.

Before we jump in to the subject, let’s take a look at the role of the Data Dictionary (DD) in the software evaluation process. DD should be treated as a critical component of the software package
that a vendor provides. If you have a methodology for software and vendor evaluation, assign some weight to the availability of a DD as well. Generally, there should not be any cost associated with
this piece of documentation. Also look at the format of the DD the vendor provides. DD in document format (I hope soft copy) has less value than a DD in relational format (most preferred). Insist
on having the DD with entity (table) and attribute (column) descriptions.


IMPLEMENTATION

Having the DD for the software package is the first step. Making full use of it is the key. Follows several ways this information can be leveraged:

Formatting: The first step towards DD usage should be getting the data converted to convenient and useful format. On many occasions vendors provide data in their proprietary format such as
hard copy or simply a spreadsheet. Convert the DD into a comma delimited text file. Load it into a relational database for subsequent use. This will serve as the foundation for rest of the work
products based on DD.

Cleansing: Print reports from the DD and check for completeness and quality of the data. Develop and execute scripts to cleanse the DD data. Also look for opportunities to make it
‘prettier’ (case, punctuation, etc. in description fields).

Grouping: Generally, COTS products are organized into numerous modules. Organize the DD data with the same convention so that it gets better reporting capability. Add descriptions for the
modules as well. The idea here is to divide and conquer. Since the DD can be thousands of records in relational format.

Metadata Repository: If you have a central metadata repository, load this data in to the repository. This will make sure a single official version is made available across the departments
within IT. Also establish a process for refreshing the DD data in the repository. (e.g. refresh DD with every release cycle)

Training: Create training material using the DD to educate your developers, database administrators, data integrators and functional staff. DD can be forward engineered into MS Word, MS
Power Point or other preferred formats at this point.

Help Files: The DD can also be used for building help files. These help files can be installed on users desktops for quick reference.

Information Portal: DD will play a major role in providing help in information navigation through the organization’s information/BI portal. Make DD accessible through the portal so that
users can perform searches and get to the right data for reporting.

Extending the DD: The core concept of the DD can be extended beyond the package to downstream ODS, Data Marts and Data Warehouse tables. Capture descriptions of these tables (and columns)
into the same DD repository. This additional information should also be made available through the portal.

Integrating with Databases: Tools such as Oracle Discoverer rely on the COMMENTS (an Oracle database data dictionary column that holds description of the table and column) for tables and
columns for displaying the descriptions to the users. If you have an Oracle database, consider developing scripts to populate the COMMENTS in the Oracle dictionary using the DD.

Glossary: When developing a glossary for the system, integrate DD information into it for reference. For example, “Account Balance: Is the balance of ….. . This data is stored in
following tables-a,b,c…”

Data Access Control: Organizations are required to protect their customer data. Classifying and labeling the data fields is imperative. Extend descriptions in the DD for security
information attached to the tables and columns. The security information will help your database administrators understand sensitivity of the data elements.

Impact Analysis: Impact analysis is done by DBAs when planning database upgrades, data exports etc. DBAs can perform impact analysis with the DD repository. This functionality will be
appreciated, especially when the size of the package in terms of tables runs into thousands.


CONCLUSION

Now you can see just how far the investment in a good, accurate and detailed Data Dictionary can go in a COTS package environment. Return on investment will be high for large scale COTS packages.
And business users will thank you for taking this creative approach to socialize the data design knowledge base across teams.

Share

submit to reddit

About Satyajeet Dhumne

Satyajeet is an experienced consultant in the fields of data warehousing, business intelligence and data management. He has more than 23 years of experience in the information technology industry, and for the past 12 years he has focused on business intelligence, data warehousing and data architecture. Satyajeet holds M.S. in Management of Information Technology from McIntire School of Commerce at University of Virginia. You may reach Satyajeet via email at sgdhumne@yahoo.com.

  • chit ruiz

    one vendor says they cannot share data dictionary for proprietary reasons. if not in the SLA, is the statement valid? what remedy can the licensee make? thanks!

Top