Top 10 Mistakes to Avoid When Developing a Metadata Repository

Published in January 2000

Building a metadata repository is critical for accessing, maintaining, and controlling the vital information stored in our decision support systems (DSS). While metadata has always been a central
tenet of data warehousing, over the last couple of years it has been brought further into the spotlight, as most Global 2000 companies have some sort of decision support system currently in
place, most for several years. The vast majority of these companies have had to struggle with the task of managing the exponential growth of these systems over time. Without metadata, the task of
managing this growth becomes overly difficult and time-consuming. This need has driven many major software vendors like Microsoft, Platinum (now Computer Associates), Oracle, and IBM to enter the
metadata marketplace with significant product offerings.

Now, before you run out and start implementing a repository, it is important to note that, up to this point, too few companies have successfully implemented a metadata repository that serves the needs
of their business and technical users. This article presents the top ten mistakes to avoid when developing a metadata repository.

1. Not defining the tangible business and technical objectives of the metadata repository

This is THE top mistake that most companies make. Quite often the metadata repository team will neglect to clearly define the specific business and technical value that their metadata repository
will provide. These objectives are critical to define up front, as they will guide all subsequent project activity.

Clear business and technical objectives are definable and measurable. Defining them is imperative because, once the metadata repository is completed, the management team will have to justify the cost
of the initiative. Keep in mind that a metadata repository, like a data warehouse, is NOT a project; it is a process. The repository will need to grow to support the ever-expanding role of
the data warehouse/data marts and operational systems that it supports. In addition, as business users become more sophisticated, their demands will substantially increase. Once a cost justification
can be quantified for the initial release of the repository, the process for gaining funding for the follow-up releases is greatly simplified.

2. Examining metadata tools before defining requirements

It is surprising how often I receive calls from companies asking me to suggest a metadata tool for their repository project. My standard response is, “What are your repository’s
requirements?” Typically the reply from the other end of the line is silence. This situation is highly concerning. The metadata repository requirements must guide the tool selection process, not follow it.

As discussed in the first point, clear requirements for the metadata project are critical, as they provide the lighthouse for all subsequent project activities. Without this beacon it becomes all too likely for
the project’s course to go awry.

3. Selecting a metadata tool without conducting an evaluation

All of the major metadata vendor tools maintain and control the repository in a different manner. Finding the tool that best suits your company requires careful analysis. An educated consumer
will be the most satisfied one because they understand exactly what they’re buying and what they’re not buying.

Remember, whichever tool is purchased, none of them makes metadata integration “easy,” regardless of the marketing materials or salesperson’s hype. To be successful in your metadata project it
takes knowledge, discipline, talented employees, and good old-fashioned hard work, just like any other major IT endeavor. While none of the tools eliminate these needs, for most companies it is far
better to purchase a tool and work around its limitations, as opposed to building everything from scratch.

4. Not creating a metadata repository team

Very often, companies neglect to form a dedicated metadata repository team. This team will be responsible for maintaining, controlling, and providing access to the metadata repository. The typical
metadata repository team, at full staff, will consist of 1–2 data modelers, 2 metadata integration developers, 2 metadata access developers, 1–2 business analysts, a metadata
repository architect, and a project leader. Keep in mind that some of the roles can be filled by the same person, depending on the size and schedule of the effort.

In addition, it is important for the metadata repository project leader to report to the same person as the head of the decision support system team does. This creates a peer-level relationship
between the metadata repository and the decision support team leaders. The metadata repository team and the decision support team must work together, as each team’s work directly impacts the
other. A flawed or muddled data warehouse architecture will directly impact the quality of the metadata repository. Conversely, a poorly designed repository will greatly reduce the effectiveness of
the decision support system.

5. Having too many manual processes in the metadata integration architecture

The process for loading and maintaining the metadata repository needs to be as automated as possible. Less-than-successful metadata implementations typically contain far too many manual
processes in their integration architectures. The task of manually keying in metadata becomes much too time-consuming for the metadata repository team. With careful analysis and some development
effort, the vast majority of these manual processes can be removed.

Very often much of the business metadata will require some sort of manual activity just to capture the information. Additional processes will most likely need to be developed to allow the business
leaders and analysts to modify the business metadata. Unfortunately, some companies manually key in a great deal of their business metadata, which makes the repository non-scalable and impossible
to maintain over time.
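
As a rough illustration of what automated capture of technical metadata can look like, the sketch below reads table and column definitions directly from a database catalog instead of having someone key them in. It uses Python's built-in sqlite3 module purely as a stand-in for a source system; the function name harvest_technical_metadata, the record layout, and the sample customer table are assumptions made for this example and do not come from any vendor tool.

```python
import sqlite3


def harvest_technical_metadata(conn):
    """Return one record per column, read from the SQLite catalog.

    Hypothetical helper for illustration; a real repository load would
    target the catalog of whatever source systems are actually in use.
    """
    records = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, col, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table_name}")'
        ):
            records.append({
                "table": table_name,
                "column": col,
                "data_type": col_type,
                "nullable": not notnull,
                "primary_key": bool(pk),
            })
    return records


# Hypothetical source system with a single table, created in memory for the demo.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, region TEXT)"
)
metadata = harvest_technical_metadata(source)
for record in metadata:
    print(record)
```

A scheduled job could run a harvest like this against each source and load the records into the repository, leaving manual entry only for the business definitions that cannot be derived from a catalog.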

6. The metadata repository is difficult to access

A key goal for all metadata repository projects must be to provide open access to the metadata to any and all business and technical users, with minimal to no effort required from them. Many of the
earlier metadata repository efforts did a decent job of integrating valuable metadata; however, these efforts got sidetracked because the metadata was never rolled out to the users. Users were required
to go to the metadata repository team to “beg” for the information that they needed. Needless to say, this approach does not succeed.

7. Letting the metadata tool vendors manage your project

All too often companies will want the metadata integration tool vendor to manage and implement their repository project. This is a critical mistake as these vendors tend to be highly tool focused,
as they rightfully should be. While the metadata integration tool is at the heart of the metadata process, it takes a lot more than a tool to create a fully functional repository. In addition,
typical software vendor consulting staff are not true integrators; instead, they are tool experts, which is what they should be focused on.

8. Not having an experienced metadata project manager/architect leading the project

An experienced metadata repository project leader keeps the vision of the project in concert with the real-world reality of metadata and decision support. In addition, the architecture of the
repository must be scalable, robust, and maintainable so that it can accommodate the expanding and changing DSS and metadata requirements. These fundamental challenges require a highly
experienced, senior-level individual.

If a consultant is used to initially get the project up and running, it is imperative that the person is highly skilled at knowledge transfer and that an in-house employee has been assigned to
shadow the consultant from the outset of the project. Be wary of consultants without real-world, hands-on experience. It’s one thing to be able to write or speak about metadata; it’s
entirely something else to have the experience needed to navigate through the political quagmires and the knowledge of what it takes to physically build a metadata repository.

9. Trivializing the metadata repository effort

All too often, companies do not realize the amount of work it takes to build a metadata repository. Everything a company needs to do to build a data warehouse, it also needs to do to build a metadata
repository. These tasks include defining business/technical requirements, data modeling, source system analysis, source data extraction/capture, source data transformation, data cleansing, data
loading, and end user access. To increase the likelihood of the project’s success, it is best to develop the metadata repository iteratively as opposed to building everything all at once.
However, when doing a project iteratively you must have the end result in mind at all times, as it will be your guiding wind.

It is important not to overlook the political challenges of the metadata effort. Politics can cause even the best-planned metadata and decision support projects to go astray. Remember, cooperation will be
needed from multiple IT (information technology) and business teams to support the metadata effort.

10. The metadata repository team creates standards none of the supporting teams can follow

In order to capture much of the key business and technical metadata, the metadata repository team will need to develop standards that the decision support team and business users can easily follow.
Quite often the metadata repository team makes the processes and procedures for following these standards far too complex and tedious. When this situation occurs, the metadata repository team
becomes viewed as a bottleneck to the decision support development process. At this point, it is normally only a matter of time before the metadata repository team is disbanded. To prevent this
situation from occurring, make sure to keep all processes and procedures simple and easy to follow. In addition, keep the amount of time needed to complete them to a minimum, and do not neglect to
create a feedback loop so that the other teams can let you know how you’re doing.

11. Contractually obligate the metadata software vendor to provide a named architect

It’s impossible to encapsulate all of the metadata repository and metadata traps in a series of only 10 points, so here is a bonus tip.

As with all software, metadata integration tools come with steep learning curves. These learning curves have sunk more than one project. What will greatly reduce your risk of failure is to have a
person who understands how to architect with the specific metadata integration tool. On the other hand, be prepared to pay a hefty fee for this person. Top-of-the-line people are in high demand
and more than make up for the investment in the amount of time and pain they can save. As a result, before the software is purchased, I interview the proposed architect and we make that
person’s time a condition of the sale.


About David Marco

Mr. Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence, and is the world’s foremost authority on metadata. Mr. Marco is the author of several widely acclaimed books including “Universal Meta Data Models” and “Building and Managing the Meta Data Repository: A Full Life-Cycle Guide”. Mr. Marco has taught at the University of Chicago, DePaul University, and in 2004 he was selected to the prestigious Crain’s Chicago Business “Top 40 Under 40”. He is the founder and President of EWSolutions, a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing, enterprise architecture and managed metadata environment technologies (