Business Metadata: How to Write Definitions

Published in TDAN.com April 2005


Introduction: The Importance of Definitions

Many errors and accidents are caused by misunderstandings of the meanings of the terms used.

How many times have you been in a meeting where the words you heard did not mean what you thought they meant?

Many business decisions are made (and later regretted) because of a misunderstanding of the data and of what a data element used in a report signifies. Some of these accidents and misunderstandings
are large enough to be reported in the media. In prior papers I refer to the Mars Lander episode, in which the unit of measure was assumed rather than made explicit, miscalculations were made and the
equipment was lost. Our businesses are filled with many such examples; although perhaps not as costly, they still have a real impact on the business.

Context is everything. The English language is full of nuances of meaning; a word may have multiple meanings depending on the context in which it is used.

Business metadata is all about adding context to data. A Dictionary or Glossary is part of business metadata, and it is all about making meaning explicit and providing definitions to business
terms, data elements, acronyms and abbreviations. This article is about how to write a good definition. Future articles will be about how to set up a dictionary that will be used.


Definition of a Definition

What is the definition of a definition?

The short form might be: the meaning of a term, formally stated. (From the InfoDeveloper toolkit: http://saulcarliner.home.att.net/id/definitions.htm)

Usage note: the term “Term” refers to a word or phrase that has a definite meaning to the business and is significant enough for the business to manage it in a glossary and to store
data values concerning its occurrence in the real world.


Components of a Definition

The use of a Controlled Vocabulary (CV) helps software manage glossaries and can also empower enterprise search capabilities. CV acronyms and specialized terms are indicated in italics and
parentheses.

Here are the components of a well-written definition. The actual definition text is composed of items 3, 4 and 5 below.

  1. The name of the term being defined.
  2. Part of speech (optional; can be helpful). Examples: noun, verb.
  3. Broader term (BT): general class to which the thing belongs; sometimes this is implied. In object modeling parlance this is called an “IS-A” relationship. Example: “A spoon is
    a utensil”. Note that some definitions may be explaining things in the past, which would be “WAS-A” instead of “IS-A”.
  4. Distinguishing Characteristics, otherwise known as pertinent attributes with specific values. In object modeling parlance this is called a “HAS-A” relationship. Example: “A
    spoon has a small bowl attached to the end”. Note that some definitions may be explaining things in the past, which would be “HAD-A” instead of “HAS-A”.
  5. Function Qualifier, describing how the thing being defined is used; this usually involves one or more verbs. I would extend the CV structure to include USED-FOR, but this term is
    not to be confused with USE-FOR, described below. As with the previous two components, the Function Qualifier may describe something in the past, but USE-FOR is already taken, so the past tense
    may have to be implied or made explicit in the text.
  6. Narrower Term (NT) refers to the classes below the term being defined. For example, if the term being defined is Spoon, then pertinent narrow terms of interest could be Soup,
    Tea, Serving.
  7. Related term (RT) refers to a term that has relevance to the term being defined but is not a synonym. For example: Can opener is related to can but is not a synonym of it. Dictionaries,
    indexes and search engines often have a “SEE ALSO” section that lists RTs.
  8. Synonyms, or terms that mean nearly the same thing as the term being defined. CVs often handle synonyms using a synonym ring, defining one term as the (PT) or Preferred Term.
    Glossaries, indexes and search engines display synonyms in the “SEE…” section.
  9. Examples. Example of the term; an instance of the term as it is seen in everyday life. Example (of the example!): An example of an employee is Mary Jones.
  10. Usage refers to using the term in a sentence. The example can incorporate sentence usage, but it doesn’t have to. Example: A Spoon is defined as an eating utensil that has a
    small bowl at the end. Usage: Mary gracefully lifted her spoon to her mouth to sample the soup.
  11. Source refers to where the definition came from. If it came from a document or manual, pertinent reference information may be important (date of the document, author, etc.)
  12. Dates may be important. Create and modify dates, which track inserts and changes to the glossary, should always be recorded. In addition, if governance is used, dates that indicate the
    validity of the definition, such as Effective Date and Expiry Date, may be necessary.
  13. Replaced by: Sometimes you may want to keep “legacy” terms in your glossary, especially after re-engineering or a migration to a different system that uses different
    terms. You may want to indicate that the term is legacy and has been replaced by some other term. Alternatively, you can use Synonyms, but Replaced by indicates that the legacy term should
    not be used in common usage.
  14. Approval information can be added to track the governance trail, for such things as when the definition was approved, by whom, etc.
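
The components listed above map fairly naturally onto a record structure. The following is only a minimal sketch in Python (3.9 or later); the class name `GlossaryEntry` and every field name are assumptions made for illustration and do not come from any particular glossary tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class GlossaryEntry:
    """One glossary term. Field names are illustrative assumptions."""
    name: str                                    # 1. term being defined
    part_of_speech: Optional[str] = None         # 2. optional
    broader_term: Optional[str] = None           # 3. BT / IS-A
    distinguishing_characteristics: list[str] = field(default_factory=list)  # 4. HAS-A
    function_qualifier: Optional[str] = None     # 5. USED-FOR
    narrower_terms: list[str] = field(default_factory=list)   # 6. NT
    related_terms: list[str] = field(default_factory=list)    # 7. RT ("SEE ALSO")
    synonyms: list[str] = field(default_factory=list)         # 8. "SEE"
    examples: list[str] = field(default_factory=list)         # 9. instances
    usage: Optional[str] = None                  # 10. term used in a sentence
    source: Optional[str] = None                 # 11. where the definition came from
    created: Optional[date] = None               # 12. dates
    modified: Optional[date] = None
    effective_date: Optional[date] = None
    expiry_date: Optional[date] = None
    replaced_by: Optional[str] = None            # 13. set only for legacy terms
    approved_by: Optional[str] = None            # 14. approval information
    approved_on: Optional[date] = None


# The spoon example from the list above, expressed as a record.
spoon = GlossaryEntry(
    name="Spoon",
    part_of_speech="noun",
    broader_term="Utensil",
    distinguishing_characteristics=["a small bowl attached to the end"],
    function_qualifier="eating",
    narrower_terms=["Soup", "Tea", "Serving"],
    usage="Mary gracefully lifted her spoon to her mouth to sample the soup.",
)
```

Keeping the class, attribute and function parts in separate fields, rather than in one block of text, is a design choice that lets software check each part independently.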


Definition Usage Notes


Definition Text Structure

As noted above, the three major parts of the definition text are items 3, 4 and 5. They are:

  • IS-A (class)
  • HAS-A (attribute discrimination)
  • USED-FOR (function)

A good, sound definition must make explicit two out of three of these components. The following example incorporates the first and the third, IS-A and USED-FOR:

IS-A:

A sleeve is a part of a shirt that

USED-FOR:

goes over the arm.

Note that “a part of” indicates the broader term or class: Shirt. “Goes over the arm” illustrates a distinguishing characteristic that is a function or use of the sleeve.

PART-OF and TYPE-OF are terms that indicate a broader class relationship. Often the class relationship denoted by IS-A is implied. Sometimes these types of implications can be important to make
explicit, sometimes not.
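
The two-out-of-three rule lends itself to a mechanical check. The sketch below is one possible reading of it, assuming "two out of three" means at least two; the function name `compose_definition` and its parameters are hypothetical, and each part is expected to be supplied as a ready-made phrase.

```python
from typing import Optional


def compose_definition(term: str,
                       is_a: Optional[str] = None,
                       has_a: Optional[str] = None,
                       used_for: Optional[str] = None) -> str:
    """Build definition text from the class, attribute and function parts.

    Raises ValueError unless at least two of the three parts are supplied.
    """
    clauses = [c for c in (has_a, used_for) if c]
    supplied = (1 if is_a else 0) + len(clauses)
    if supplied < 2:
        raise ValueError(f"{term!r} needs at least two of IS-A, HAS-A, USED-FOR")
    if is_a:
        # Class first, then the distinguishing clauses introduced by "that".
        return f"{term} is {is_a} that {' and '.join(clauses)}."
    # No class given: the attribute and function clauses follow the term directly.
    return f"{term} {' and '.join(clauses)}."


sleeve = compose_definition("A sleeve",
                            is_a="a part of a shirt",
                            used_for="goes over the arm")
print(sleeve)  # A sleeve is a part of a shirt that goes over the arm.
```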


Enumerated, Multiple Meanings

Sometimes a definition can include more than one meaning. This is common in a typical dictionary, and our language is full of such cases. The meaning in which the term is used must be derived from the
context of the sentence. The multiple meanings are enumerated in the definition description.

There are search tools that prompt the user with “Do you mean…”, list each possibility corresponding to the different definitions, and help the user direct the search.
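
One way to store enumerated meanings, and to generate the kind of prompt such a search tool shows, is sketched here. The term “bank”, the wording of its two meanings, and the names `GLOSSARY`, `format_entry` and `do_you_mean` are all illustrative assumptions, not taken from any real tool.

```python
# Illustrative glossary: each term maps to an ordered list of meanings.
# The term and its meanings are an example chosen for this sketch.
GLOSSARY = {
    "bank": [
        "a financial institution that accepts deposits and makes loans",
        "the sloping land alongside a river",
    ],
}


def format_entry(term: str, glossary: dict[str, list[str]]) -> str:
    """Render a term followed by its meanings, numbered from 1."""
    lines = [term]
    lines += [f"  {i}. {meaning}" for i, meaning in enumerate(glossary[term], start=1)]
    return "\n".join(lines)


def do_you_mean(term: str, glossary: dict[str, list[str]]) -> list[str]:
    """Return one prompt option per meaning; an unknown term gives an empty list."""
    key = term.lower()
    return [f"Do you mean {key}: {meaning}?" for meaning in glossary.get(key, [])]


print(format_entry("bank", GLOSSARY))
for option in do_you_mean("Bank", GLOSSARY):
    print(option)
```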


Broader/Narrower

BT and NT can get tricky. Sometimes a broader term or class is implied, and not important to the scope of the CV. For example, the formal definition of “definition” might be “a group of words
that expresses the meaning of a term.” A rule of thumb is that all terms used in definition text should be defined in the glossary. However, it is possible to get into a “chicken and egg”
situation with either BTs or terms used in the definition. Do you really need to define the broader term “word”?

My answer is that I would consider certain terms to be “atomic” and well-understood. I am defining “well-understood” to mean that the term is in common everyday language and general usage (in
general settings and not just in a unique industry) and therefore does not have to be defined. However, be careful! Choose these “well-understood” terms carefully and make sure they really are
well-understood. An example of a well-understood term might be “Person”. It generally refers to a Homo sapiens carbon-based life form. However, an example of a poorly understood term is
“Customer”. You think you know what it means, but do you really? “Customer” should always be explicitly defined for every business.

Eventually you reach classes (generalizations) that are not useful and are certainly not worth the effort of defining. As always, the Law of Diminishing Returns applies.
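
The rule of thumb that every term used in definition text is either defined in the glossary or treated as atomic can be checked roughly in code. This is only a sketch: the helper `undefined_terms` and the hand-picked `ATOMIC` set are assumptions, and matching is by single lowercase words, so multi-word terms and plurals are not handled.

```python
import re

# Assumption: a hand-curated set of words considered atomic and well-understood.
ATOMIC = {"a", "of", "that", "the", "group", "words", "person"}

# Definition text taken from the "definition" example in this section.
glossary = {
    "definition": "a group of words that expresses the meaning of a term",
}


def undefined_terms(glossary: dict[str, str], atomic: set[str]) -> dict[str, list[str]]:
    """Map each term to the words in its definition that are neither defined nor atomic."""
    known = {term.lower() for term in glossary} | atomic
    missing = {}
    for term, text in glossary.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        unknown = sorted(words - known)
        if unknown:
            missing[term] = unknown
    return missing


print(undefined_terms(glossary, ATOMIC))
# {'definition': ['expresses', 'meaning', 'term']}
```

Each word reported is a decision point: either add it to the glossary or accept it as atomic. The list stops growing once the remaining words are not worth defining.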


Miscellaneous Guidelines

1. A definition should never be a tautology, i.e. defined by itself. For example: “A unicorn is a beast with one horn” tells us nothing because the word unicorn is a compound word whose parts are uni = one and corn = horn. More common examples of tautological definitions are:
  • Customer ID is defined as “The identifier of the Customer”.
  • “Metal is something made of metal”.
2. Parts of speech should agree; for example, a noun should be defined with a noun, a verb with a verb, etc. Do not use an adverb phrase like “is when” or “is where” in the definition text.
3. All terms used in the definition should either be defined in the glossary or lexicon themselves or be considered atomic/basic terms.
4. State distinguishing characteristics precisely.
5. Avoid classes that are too broad. A BT or class should be large enough to include all members but not so large that it doesn’t add any value.
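
Guidelines 1 and 2 can be approximated with a simple check on the definition text. The sketch below is a heuristic only: the function `lint_definition`, the `STOP_WORDS` set and the “Overdraft” example are assumptions made for illustration. It flags any reuse of the term's own words for human review, so it will over-flag some legitimate definitions, and it cannot catch an etymological tautology such as the unicorn example.

```python
import re

# Assumption: small words ignored when comparing a term with its definition text.
STOP_WORDS = {"a", "an", "the", "of", "is"}


def lint_definition(term: str, text: str) -> list[str]:
    """Return warnings for guidelines 1 and 2; an empty list means none were found."""
    warnings = []
    lowered = text.lower()
    term_words = set(re.findall(r"[a-z]+", term.lower())) - STOP_WORDS
    text_words = set(re.findall(r"[a-z]+", lowered))

    # Guideline 1: the definition reuses a word from the term itself.
    reused = sorted(term_words & text_words)
    if reused:
        warnings.append("possible tautology, reuses: " + ", ".join(reused))

    # Guideline 2: the text contains "is when"/"is where" or opens with "when"/"where".
    if re.search(r"\bis\s+(when|where)\b", lowered) or re.match(r"\s*(when|where)\b", lowered):
        warnings.append('uses "is when" or "is where"')
    return warnings


print(lint_definition("Customer ID", "The identifier of the Customer"))
print(lint_definition("Overdraft", "is when an account balance falls below zero"))
print(lint_definition("Sleeve", "a part of a shirt that goes over the arm"))
```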


Summary

The importance of glossaries and dictionaries cannot be overestimated. Definitions facilitate communication, and help the business make accurate decisions.

In short, definitions should be clear, exact and complete; if not, a spoon can be confused with a knife or fork if the definition reads like this: “a spoon is something to eat with.” It is this
sort of confusion that costs businesses lots of money from bad business decisions and bad practices. Exposing definitions and making people more aware of the meanings of terms can help the business
in a myriad of ways.

This article was also published in March of 2005 on B-Eye-Network.com at http://www.b-eye-network.com/view/734. Check out this publication; it is well worth repeat visits.


Bonnie O'Neil


Bonnie O'Neil is a Principal Computer Scientist at the MITRE Corporation, and is internationally recognized on all phases of data architecture including data quality, business metadata, and governance. She is a regular speaker at many conferences and has also been a workshop leader at the Meta Data/DAMA Conference, and others; she was the keynote speaker at a conference on Data Quality in South Africa. She has been involved in strategic data management projects in both Fortune 500 companies and government agencies, and her expertise includes specialized skills such as data profiling and semantic data integration. She is the author of three books including Business Metadata (2007) and over 40 articles and technical white papers.
