What’s in a Name?

Establishing and enforcing naming conventions is typically one of the first duties to be tackled when implementing new software. Adequate thought and preparation is required in order for such a
task to be successful. What amount of effort should be extended in the creation of appropriate database naming standards? Are current industry standards acceptable? Shakespeare, many, many years
ago, may have said it best when he wrote: “What’s in a name? That which we call a rose by any other name would smell as sweet.” But, if that is true, then why do those of us in the DP industry
spend so much time and effort developing and enforcing naming standards? Maybe what something is called is not quite so trivial a matter after all!

Well, we know what Shakespeare was trying to convey: the name by which we call something has no effect upon the actual object. Calling a desk a lamp will not turn it into a lamp: it is still a
desk. Sometimes DP professionals, and database developers in particular, forget this. This article will provide a different approach to database naming standards by ignoring all of the basic
generalizations that were foisted upon us in Database 101 class and taking a fresh look at the naming of relational database objects. The article will concentrate on DB2 objects, but the arguments
made in the article are just as valid for other relational databases as well.

Tables and Their Cousins

In an RDBMS the primary data store is the table (this is not 100% true from a physical implementation perspective, but it will suffice for the purposes of this article). A table consists of
multiple rows, each with a fixed and unchanging number of defined columns. However, most RDBMSs provide alternative means of accessing data from these tables. For example, DB2 allows the following:

  • ALIAS – an alternative name that can be used in SQL statements to refer to a table or a view in the same or a remote DB2 subsystem.
  • SYNONYM – an alternative name that can be used in SQL statements to refer to a table or a view in the same DB2 subsystem. Synonyms are accessible only by the synonym owner.
  • VIEW – an alternative representation of data from one or more tables or views.
  • All these alternative means of access are similar in one way: they all present data by means of values in rows and columns. An end user need not know whether he or she
    is querying a table, an alias, a synonym, or a view. To the user, the results are the same – data represented by values in rows and columns.

This brings us to our first revision to the common wisdom for database naming standards. Use the exact same naming convention for tables, views, aliases, and synonyms. These four objects all
logically refer to the same thing in the relational model – a representation of data in terms of columns and rows.

Many shops implement different naming conventions for each of these objects, but this needlessly complicates matters. Furthermore, stop to consider whether it really make sense for them to be named
differently. Let’s examine the pros and cons.

Consider your current table naming conventions. If your shop is typical, you will find a convention that is similar to the one depicted in Figure 1. If they are different, pause for a moment to ask
yourself why. This is almost an industry standard for table naming. You most surely do not want to force every DB2 table to begin with a T (or have a strategically embedded T within the table
name). The name of a DB2 table should accurately and succinctly convey the contents of the data it contains. The naming convention displayed in Figure 1 accomplishes this.

Format:

aaa_dddddddddddddd

aaa

three character application identifier

dddddddddddddd

unique description up to 14 characters in length

Figure 1. Table naming convention.

So this brings us to our second rule for revising database naming conventions. Avoid embedding a ‘T’, or any other character, into table names to indicate that the object is a table. Likewise,
indicator characters should be avoided for any other table-like object (i.e. aliases, synonyms, and views).

Although most shops avoid embedding a ‘T’ in the table name, many of these same shops do embed a character into alias, synonym, and view names. The primary reason given is that the character
makes it easy to determine what type of object is being accessed just by looking at the name. There are two reasons why this is a bad idea. The first is a semantic reason, the second a flexibility
issue.

In semantic terms, an object’s name need only identify the object, not the object’s type. Consider the following arguments:

How are people named? Usually one can ascertain the gender of someone simply by knowing their name but would you banish all males named Chris, Pat, or Terry? Or maybe all females named Chris, Pat,
and Terry? After all, men and women are different. Shouldn’t we make sure that all men’s names are differentiated from women’s names? Maybe we should start all men’s names with an M and all
women’s names with a W? If we did, we’d sure have a lot of Marks and Wendys, wouldn’t we? The point here is that context enables us to differentiate men from women, when it is necessary. The
same can be said of database objects.

How are COBOL program variables named? Do you name your 01, 05, and 77 level variable names differently in your COBOL programs? For example, do all 01 levels start with ‘O’ (for one), all 05
levels start with ‘F’, and all 77 levels start with ‘S’? No? Why not? Isn’t this the same as forcing views to start with V (or having a strategically embedded V within the name)?

What about the naming of pets? Say I have a dog, a cat, and a bird. Now, I wouldn’t want to get them confused, so I’ll make sure that I start all of my dog names with a D, cat names with a C, and
bird names with a B. So, I’ll feed C_FELIX in the morning, take D_ROVER for a walk after work, and make sure I cover B_TWEETY’s cage before I go to bed. Sounds ridiculous, doesn’t it?

The whole point of this tirade is that if we don’t manufacture strange names in the real world, why would we want to do it with our DB2 objects? There is really no reason to embed special
characters into DB2 objects names to differentiate them from one another. It is very practical and desirable to name DB2 objects in a consistent manner, but that consistent manner should be well
thought out and should utilize the system to its fullest capacity wherever possible.

The second reason for this rule is to increase flexibility. Say, for example, that we have a table that for some reason is significantly altered, dropped, or renamed. If views are not constrained
by rigid naming conventions requiring an embedded ‘V’ in the name, then a view can be constructed that resembles the way the table used to look. Furthermore, this view can be given the same name
as the old table. This increases system flexibility and reduces program maintenance when change occurs.

Most users don’t care whether they are using a table, view, synonym, or alias. They simply want the data. And, in a relational database, tables, views, synonyms, and aliases all logically appear
to be identical to the end user: as rows and columns. It is true that there are certain operations that cannot be performed on certain types of views, and users who need to know this will generally
be sophisticated users. For example, very few shops allow end users to update any table they want using a SQL generator, a reporting tool (such as QMF or SPUFI), or some other tool that uses
dynamic SQL. Updates, deletions, and insertions (the operations which are not available to some views) are generally coded into application programs and executed in batch or via on-line
transactions. The end user does need to query tables dynamically. Now you tell me, which name will your typical end user remember more readily when he needs to access his marketing contacts:
MKT_CONTACT or VMKTCT01?

The sophisticated end users and administrative folks who need to know whether the object is a table, a view, a synonym, or an alias can readily obtain this information by querying the system
catalog or using an administrative DBA tool.

Further Arguments For Indicators

Some shops believe they have very valid reasons for embedding an object type indicator character into database objects – view names, in particular. Let’s examine these arguments.

Point

Embedding a V into our view names enables our DBAs to quickly determine which objects are views and which are tables.

Counterpoint

Many of these shops do not embed a T into the table name, but feel that a V in the view name is necessary. It is believed that the DBA will be able to more easily discern views from tables. But,
these same shops do not always force an S into synonym names or an A into alias names. Even if they do, it is usually overkill. Any good DBA already knows which objects are tables and which are
views, and if he or she doesn’t, a simple query against the system catalog will clarify the matter.

For example, in DB2, this query will list all table-like objects:

SELECT NAME, CREATOR, “TABLE”
FROM SYSIBM.SYSTABLES
WHERE TYPE = “T”
UNION ALL
SELECT NAME, CREATOR, “ALIAS”
FROM SYSIBM.SYSTABLES
WHERE TYPE = “A”
UNION ALL
SELECT NAME, CREATOR, “TEMPORARY TABLE”
FROM SYSIBM.SYSTABLES
WHERE TYPE = “G”
UNION ALL
SELECT NAME, CREATOR, “SYNONYM”
FROM SYSIBM.SYSSYNONYMS
UNION ALL
SELECT NAME, CREATOR, “VIEW”
FROM SYSIBM.SYSVTREE
ORDER BY 3, 1

Point

It is necessary to code view names differently so that users understand that they are working with a view and not all operations can be performed on the view.

Counterpoint

All operations can be performed on some views but not all operations can be performed on all tables! What if the user does not have the security to perform the operation? For example, what is the
difference, from the user’s perspective, between accessing a non-updateable view and accessing a table where only the SELECT privilege has been granted? None!

Use It or Lose It

Another common problem with database naming conventions is unnecessary size restrictions. Using DB2 as an example, most objects can have a name up to 18 characters long. But, in many instances,
shops establish naming standards that do not utilize all of the characters available. This is usually unwise.

This brings us to our third revision. Unless a compelling reason exists, ensure that your standards allow for the length of database object names to utilize every character available.

Refer to Figure 2 for maximum and recommended DB2 object name lengths. Notice that, except for indexes, the recommended length is equal to the maximum length for each object.



Database


Object



Maximum Length



Recommended Length

Database

8

8

Tablespace

8

8

Table

18

18

View

18

18

Alias

18

18

Synonym

18

18

Index

18

8

Storage group

8

8

Constraint

8

8

Column

18

18

Figure 2. Maximum and recommended DB2 object name lengths.

Why are indexes singled out in DB2? This is an example of a compelling reason to bypass the general recommendation. Developers can explicitly name DB2 indexes, but they cannot explicitly name DB2
index spaces. Yet, every DB2 index requires an index space name. The index space name will be implicitly generated by DB2 from the index name. If the index name is 8 characters or less in length,
then the index space name will be the same as the index name. However, if the index name is longer than 8 characters, DB2 will use an internal, proprietary algorithm to generate a unique, 8-byte
index space name. As this name cannot be determined prior to index creation, it is wise to limit the length of index names to 8 characters. This is a good example of the maxim that there are
exceptions to every rule.

Embedded Meaning

One final troublesome naming convention is embedding specialized meaning into database object names. The name of an object should reflect what that object is or represents. However, it should not
attempt to define the object. Revision number 4. Do not embed specialized meaning into database object names.

Let’s examine this revised rule by means of an example. Some shops enforce DB2 index naming conventions where the type of index is embedded in the index name. For example, the index name may
require an X in it, or a special character is reserved to indicate certain physical implementation characteristics (such as to clustering or uniqueness). Look at the (bad) standard shown in Figure
3.

Format Xaaaaayz
X constant index identifier
aaaa five character unique index description
y

indicator specifying type of index:


C = clustering


N = non-clustering

z

indicator specifying type of index:


P = primary key


F = foreign key


U = unique


N = non-unique

Figure 3. Sample (bad) index naming standard.

Note two potential problem areas with this standard:

  • An embedded X identifies this object as an index.
  • Embedded meaning in the form of indicators detailing the type of index.

The embedded indicator character ‘X’, although unnecessary, is not as evil as indicator characters embedded in table-like objects. Indexes are not explicitly accessed by users. Therefore, obscure
or difficult to remember naming conventions are not a problem. The same arguments hold true for tablespace names.

In fact, indicator characters may actually be helpful to ensure that tablespaces and indexes are never named the same. Tablespaces and indexes, within the same database, can never be named the
same. DB2 uses a name generation algorithm to enforce uniqueness if it is attempted.

This exception results in another revised rule. If you must use indicator characters in database names, use them only in objects which are never explicitly accessed by end users.

The second potential problem area poses quite a bit of trouble. Consider the following cases which would cause the embedded meaning in the index name to be incorrect:

  • The primary key is dropped.
  • A foreign key is dropped.
  • The index is altered from non-unique to unique (or vice versa) using a database alteration tool.

What if an index is defined for a foreign key, but is also unique? Should we use an ‘F’ or a ‘U’? Or do we need another character?

What if an index is modified from clustering to non-clustering? Why does this require an index name change? This may potentially confuse performance analysts accustomed to seeing the old index name
in access path reports.

Additionally, misconceptions can occur when clustering must be identified in the index name. For example, in DB2, if no clustering index is explicitly defined, DB2 will use the first index created
as a clustering index. Should this index be named with an embedded ‘C’ or not?

Let’s look at one final example from the real world to better illustrate why it is a bad idea to embed specialized meaning into names. Consider what would happen if we named corporations based
upon what they produce. When IBM began, they produced typewriters. If we named corporations like we name database objects, the company could have been named based upon the fact that they
manufactured typewriters when they began. IBM might have been called TIBM (the T is for typewriters). And guess what, they don’t make typewriters any longer. What should we do? Rename TIBM or live
with a name that is no longer relevant?

Synopsis

Naming conventions evoke a lot of heated discussion. Everybody has their opinion as to what is the best method for naming database objects. Remember, though, that it is best to keep an open mind.
Johnny Cash may have been upset that his father gave him a girl’s name in the hit song “A Boy Named Sue,” but that was before he knew why. If this article caused you to think about naming
conventions from a different perspective, then I will consider it a success.

Share

submit to reddit

About Craig Mullins

Craig S. Mullins is a data management strategist and principal consultant for Mullins Consulting, Inc. He has three decades of experience in the field of database management, including working with DB2 for z/OS since Version 1. Craig is also an IBM Information Champion and is the author of two books: DB2 Developer’s Guide and Database Administration:The Complete Guide to Practices and Procedures. You can contact Craig via his website.

Top