Tools for a Comprehensive Database Environment – Part 2

There are five major RDBMS products: Oracle, IBM DB2, Microsoft SQL Server, Sybase SQL Server, and Informix. Each is a complete, full-function RDBMS. An organization can install and use the DBMS as delivered, but will quickly recognize that the functionality needed to adequately support large-scale database development is not provided adequately by the RDBMS alone.

The administration and maintenance of database applications is time-consuming if you use only the standard features of the DBMS. Fortunately, many tools that enhance the functionality of relational database management systems are available from third party vendors. These tools ease the administrative burden and reduce the possibilities of database error.

The true need for these tools becomes apparent simply by reviewing the sheer number of products that are available. Most organizations implement at least one or more add-on tools for their DBMS. But, many varieties of tools are available that fulfill market niches not adequately supported the major RDBMS vendors.

In the second of two parts, this article addresses database tools that focus on the analysis, reporting, and utilities of database administration. The first part of the article addressed database tools that focus on development.

The following is a rundown of the categories of products.

Covered in Article Part 1 – Development Tools
(Link to Article Part 1)

Client/Server Tools
Compression Tools
Database Modeling and Design Tools
Database Object Altering Tools
Database Object Migration Tools
Data Warehousing Tools
Debugging Tools
Programming and Development Tools
Propagation Tools
Operational Support Tools
Referential Integrity Tools
Replication Tools
Security Tools
Table Editors
Testing Tools
Synopsis

Auditing Tools
Catalog Query and Analysis Tools
Database Analysis Tools
Miscellaneous Tools
Online Standards Manuals
Performance Monitors
Query Tools
Repositories
Space Management Tools
SQL Analysis Tools
Utility Enhancement Tools
Utility Management Tools
Synopsis

1Auditing Toolsback to top

An audit is the examination of a practice to determine its correctness. Database auditing software therefore should help in monitoring the data control, data definition, and data integrity in the database environment. Several RDBMS products provide limited audit mechanisms, but these features are typically difficult to implement, use, and maintain.

Typical features that should be auditable include:

  • Authorization failures
  • Grant and revoke SQL statements
  • DDL statements issued
  • DML statements issued
  • SQL bind requests or new program invocations
  • Utility executions

An audit tool should provide five important features:

  1. The first feature an auditing tool should provide is the capability to read the database logs, which are always produced, and report on activity as needed. This reduces overhead because it uses the regular processing features of the database instead of an additional tracing feature, which might increase overhead.
  2. The second feature of the auditing tool is a set of pre-packaged reports for all data modification from the database logs. The reports should be easy to produce and easy to read.
  3. The third feature of an auditing tool is reporting who (by user or authorization ID) makes each change, and also showing a before and after image of the changed data.
  4. The fourth feature the auditing tool should provide is the capability to interface with any auditing features inherent to the DBMS if so desired.
  5. Finally, the auditing tool should provide both standard reports and the capability to create site-specific reports.

If your shop has strict auditing requirements, an auditing tool is almost mandatory because of the weak inherent auditing capabilities of most DBMS products.

Catalog Query and Analysis Toolsback to top

The system catalog or data dictionary (depending on the DBMS) contains a wealth of information essential to the operation of DBMS. Information about all database objects, authority, and recovery is stored and maintained in the system catalog. The system catalog is composed of relational tables, and can be queried using SQL and/or vendor-supplied stored procedures. Some DBMS vendors provide system catalog views that are easier to query and monitor. However they are accessed, these tables provide a base of information for many monitoring and administrative tasks.

But coding SQL can be a time-consuming process. Often, you must combine information from multiple catalog tables to provide the user with facts relevant for a particular task. And, for the most part, the DBMS vendors have made the tables difficult to understand and query by using odd naming conventions, denormalized structures, unused columns, poor data type choices, and little documentation. When stored procedures and views are provided querying these tables is easier, but sometimes crucial information is missing from these pre-canned “queries.”

Add-on tools can ease the burden of developing complex SQL queries to access the system catalog tables. The basic feature common to all catalog tools is the capability to request catalog information using a GUI (or panel-driven) interface without coding SQL statements. Analysts can obtain rapid access to specific facts stored in the system catalog without the burden of coding (sometimes quite complex) SQL.

System catalog tools that provide only this level of capability are rudimentary tools at best. Most of these tools provide much more capability. Instead of merely enabling data access, many tools can do one or more of the following:

  • Create syntactically correct DDL statements for all database objects by reading the appropriate system catalog tables. These statements are generally executed immediately or saved in a data set for future reference or use.
  • Modify any “updateable” statistical columns using a non-SQL interface.
  • Create syntactically correct authorization/security statements from the catalog in the same way that DDL is generated.
  • Perform “drop analysis” on a SQL drop statement. This analysis determines the effect of the cascading drop by detailing all dependent objects and security that will be deleted as a result of executing the drop.
  • Provide a hierarchic listing of database objects. For example, if a specific table is chosen, the tool can migrate quickly up the hierarchy to show its tablespace and database, or down the hierarchy to show all dependent indexes, views, synonyms, aliases, referentially connected tables, and plans.
  • Create and drop database objects, and grant and revoke security from a screen without coding SQL. Additionally, some tools log all drops and revokes such that they can be undone in the event of an inadvertent drop or revoke execution.
  • Operate directly on the system catalog or on a copy of the system catalog to reduce system-wide contention.
  • These features aid the DBA in performing his day-to-day duties. Furthermore, a catalog query tool can greatly diminish the amount of time required for a junior DBA (or new staff member) to become a productive member of the DBA team.

Database Analysis Toolsback to top

Most RDBMSs do not provide an intelligent database analysis capability. Instead, a database administrator or performance analyst must keep a vigilant watch over each database and its objects using system catalog views and queries or a system catalog tool. This is not an optimal solution because it relies on human intervention for efficient database organization, opening up the possibility of human error, forgetting to monitor, and misinterpreting analyzed data.

Fortunately, database analysis tools exist that can proactively and automatically monitor your database environment. This monitoring can

  • Collect statistics for tables and indexes. These statistics can be standard statistical information, extended statistics capturing more information (for example, data set extents), or a combination of both.
  • Read the underlying data sets for the database objects to capture current statistics, read the database statistics from the system catalog, read tables unique to the tool that captured the enhanced statistics, or any combination of these three.
  • Set thresholds whereby the automatic scheduling of a database reorganization is invoked based on current statistics.
  • Provide a series of canned reports detailing the potential problems for specific database objects.

Repositoriesback to top

A repository stores information about an organization’s data assets. Repositories are used to store meta-data, or data about data. They frequently are used to enhance the usefulness of application development.

In choosing a repository, base your decision on the meta-data storage and retrieval needs of your entire organization, not just the databases you wish to support. Typically, a repository can

  • Store information about the data, processes, and environment of the organization.
  • Support multiple ways of looking at the same data. An example of this concept is the three-schema approach, in which data is viewed at the conceptual, logical, and physical levels.
  • Store in-depth documentation as well as produce detail and management reports from that documentation.
  • Support data model creation and administration. Integration with popular CASE tools is also an important evaluation criterion.
  • Support change control.
  • Enforce naming conventions.
  • Generate copy books from data element definitions.

These are some of the more common functions of a repository. When choosing a repository for database development, the following features generally are desirable:

  • The data stores used by the repository can be housed in tables for the DBMS you are using. This enables your applications to directly read the data dictionary tables. For example, if you are primarily an Oracle shop, you should favor using a repository that stores its meta data information in Oracle tables.
  • The repository can directly read the system catalog or views on the system catalog for each DBMS you use. This ensures that the dictionary has current information on database objects.
  • If the repository does not directly read the system catalog, an interface should be provided to ease the population of the repository using the system catalog information.
  • The repository provides an interface to any modeling and design tools used for the generation of database objects.

This section is a brief overview of repositories. An extended discussion of repository technology is beyond the scope of this article.

Space Management Toolsback to top

Most DBMSs provide basic statistics for space utilization, but the in-depth statistics required for both space management and performance tuning are usually inadequate for heavy duty administration.

An typical example of a missing space management element is the capability to monitor the requirements of the underlying files used by the DBMS. When these files go into extents or become defragmented, performance can suffer. Without a space management tool, the only way to monitor this information is with arcane and difficult-to-use operating system commands. This can be a tedious exercise.

Additionally, each DBMS allocates space differently. The manner in which the DBMS allocates this space can result in the inefficient disk usage. Sometimes space is allocated but the database will not use it. A space management tool is the only answer for ferreting out the amount of allocated space versus the amount of used space.

Space management tools often interface with other database and systems management tools such as operating system space management tools, database analysis tools, system catalog query and management tools, and database utility generators.

SQL Analysis Toolsback to top

The development of SQL to access tables is usually the responsibility of an application development team. With SQL’s flexibility, the same request can be made in different ways. Because many of these ways are inefficient, the performance of an application’s SQL could fluctuate wildly unless it is analyzed by an expert before implementation.

The EXPLAIN or “show plan” commands provide information about the access paths used by SQL queries by parsing SQL in application programs and placing encoded output into a PLAN_TABLE or producing a standard access path report. To gauge efficiency, a DBA must decode this data and determine if a more efficient access path is available.

SQL code reviews are required to ensure that optimal SQL design techniques are used. SQL code walkthroughs typically are performed by a DBA or someone with experience in SQL coding. This walkthrough must consist of reviews of the SQL statements, the selected access paths, and the program code in which the SQL is embedded. It also includes an evaluation of database statistical information to ascertain whether production-level statistics were used at the time of the EXPLAIN.

A line-by-line review of application source code and EXPLAIN output is tedious and prone to error, and can cause application backlogs. A plan analysis tool can greatly simplify this process by automating major portions of the code review process. A plan analysis tool typically can

  • Analyze the SQL in an application program, describing the access paths chosen in a graphic format, an English description, or both.
  • Issue warnings when specific SQL constructs are encountered. For example, each time a sort is requested (by ORDER BY, GROUP BY, or DISTINCT) a message is presented informing the user of the requisite sort.
  • Suggest alternative SQL solutions based on an “expert system” that reads SQL statements and their corresponding PLAN_TABLE entries and poses alternate SQL options.
  • Extend the rules used by the “expert system” to capture site-specific rules.
  • Analyze at the subsystem, instance, server, application, plan, package, or SQL statement level.
  • Store multiple versions of EXPLAIN output and create performance comparison and plan history reports.

Currently, no tool can analyze the performance of the application code in which the SQL is embedded. For example, consider an application program that embeds a singleton SELECT inside a loop. The singleton SELECT requests a single row based on a predicate, checking for the primary key of that table. The primary key value is changed for each iteration of the loop such that the entire table is read from the lowest key value to the highest key value.

A plan analysis tool probably will not flag the SQL statement because the predicate value is for the primary key, which causes an indexed access. It could be more efficient to code a cursor, without a predicate, to retrieve every row of the table, then fetch each row one by one. This method might use parallel access, reducing I/O and elapsed time and thereby enhancing performance. This type of design problem can be caught only by a trained analyst during a code walkthrough. Although a plan analysis tool significantly reduces the effort involved in the code review process, it cannot eliminate it.

There are two features that are required for any plan analysis tool:

  • It must be capable of interpreting standard EXPLAIN or “show plan” output.
  • It must automatically scan EXPLAIN and “show plan” data (possibly with application source code), and report on the selected access paths and the predicted performance.

Performance Monitorsback to top

Performance monitoring and tuning can be one of the most time-consuming tasks for large or critical applications. Database performance monitoring and analysis tools support many performance-oriented requests in many ways. For example, database performance tools can operate

  • In the background mode as a batch job reporting on performance statistics written by the DBMS trace facility.
  • In the foreground mode as an online monitor that either traps trace information or captures information from the DBMS control blocks as applications execute.
  • By sampling the database kernel and user address spaces as the program runs and by capturing information about the performance of the job independent of database traces.
  • By capturing database trace information and maintaining it in a history file (or table) for producing historical performance reports and for predicting performance trends.
  • As a capacity planning device by giving the tool statistical information about an application and the environment in which it will operate.
  • As an after-the-fact analysis tool on a workstation for analyzing and graphing all aspects of application performance and system-wide performance.

Each database performance monitoring tool will support one or more of these features. The evaluation of database performance monitors is a complex task. Often more than one performance monitor is used at a single site.

Modern database performance tools can set performance thresholds that, once reached, will alert the DBA or perform another task to report on, or actually fix the problem. These tools are typically agent-based. An agent is a piece of independent code that runs on the database server looking for problems. It interacts with, but is not reliant upon, a console running on another machine that is viewed by the DBA. This agent architecture enables efficient database monitoring because the agent is not tied to a workstation and can act independently sending information to the DBA only when required.

Query Toolsback to top

Most DBMSs bundle a simple query tool with the product. However, many shops find these tools inadequate for developing professional, formatted reports or complete applications. It can be inadequate also for inexperienced users or those who want to develop or execute ad hoc queries.

Query tools address each of these deficiencies. The capability to format reports without programming is probably the greatest asset of these tools. Another important feature is the capability to develop data manipulation requests without using SQL.

Many vendors provide table query and reporting tools that can be used to enhance ad hoc query capabilities. These products typically can

  • Provide standard query formats and bundled reports.
  • Provide access to multiple file formats and heterogeneous databases.
  • Use static SQL and dynamic SQL for stored queries.
  • Execute database commands from the query tool.

Database query tools typically use point-and-click, drag-and-drop technology to greatly ease the report generation process. Fourth-generation languages (4GLs) are also gaining popular for accessing relational data. These products provide more functionality than a report writing tool, but with the GUI front-end that makes them easier to use than 3GL programming languages such as COBOL and C. 4GL tools typically work in one of three ways:

  • Queries are developed using 4GL syntax, which then is converted “behind the scenes” into SQL queries.
  • SQL is embedded in the 4GL code and executed much like SQL embedded in a 3GL.
  • A hybrid of these two methods, in which the executed SQL is either difficult or impossible to review.

Online Standards Manualsback to top

Products exists which provide “canned” standards for implementing, accessing, and administering databases. These tools are particularly useful for new database shops. By purchasing an On-Line standards manual these shops can quickly some up-to-speed with each new DBMS.

However, mature shops can also benefit from these types of products if the third party vendor automatically ships updates whenever the DBMS vendor new version or release of their product. This can function as cheap training in the new DBMS release (e.g. for migrating from Oracle7 to Oracle8).

A product containing database standards should:

  • provide online networked access so all developers and DBAs can access the manual
  • be extensible, so additional standards can be added
  • be modifiable, so the provided standards can be altered to suit prior shop standards (naming conventions, programming standards, etc.)

Utility Enhancement Toolsback to top

Most database utilities that ship free with the DBMS are notorious for their inefficiency, especially on very large tables. However, these utilities are required to populate, administer, and organize your databases. The typical utilities that are provided are load, unload, reorganization, backup, recover, and integrity checking.

Several vendors provide support tools that replace the database utilities and provide the same functionality more efficiently. For example, one vendor claims that its reorganization utility executes six to ten times faster than the DB2 REORG utility. These claims must be substantiated for the applications at your organization (but the this claim is quite believable).

Before committing to an alternate utility tool, be sure that it conforms to the following requirements:

  • Does not subvert the integrity of the data in the database.
  • Minimally provides the same features as the corresponding database utility. For example, if the DB2 REORG utility can REORG both indexes and tablespaces, the enhanced REORG tool must be able to
    do the same.
  • Does not subvert standard database features, when possible. For example, DB2 image copies are maintained in the DB2 system catalog. The enhanced backup tool, therefore, should store its image
    copies there as well.
  • Provides an execution time at least twice as fast as the corresponding DB2 database utility. For example, if the Sybase dump utility requires 20 minutes to backup a table, the enhanced backup
    tool must dump the same table in at least 10 minutes. (Of course, this should not be a hard-and-fast rule. Sometimes even a moderate increase in processing time is sufficient to cost-justify a
    third party utility tool.)
  • Corrects the deficiencies of the standard database utilities, when possible. For example, the DB2 LOAD utility will not load data in sequence by the clustering index. The enhanced tool should
    provide this capability.
  • When testing utility tools from different vendors, ensure that you are conducting fair tests. For example, always reload or recover prior to testing reorganization utilities or you may skew
    your results due to different levels of table organization. Additionally, always run the tests for each tool on the same object with the same amount of data.

Utility Management Toolsback to top

One last category of database tool is the utility manager. This type of tool provides administrative support for the creation and execution of database utility jobstreams. These utility generation and management tools can

  • Automatically generate utility parameters, JCL, or command scripts.
  • Monitor the database utilities as they execute.
  • Automatically schedule utilities when exceptions are triggered.
  • Restart utilities with a minimum of intervention. For example, if a utility cannot be restarted, the tool automatically terminates the utility before resubmitting it.

Miscellaneous Toolsback to top

Many types of database tools are available. The categories in this article cover the major types of tools, but not all tools can be easily pigeonholed. For example, consider a database space calculator. It reads table DDL and information on the number of rows in the table to estimate space requirements. A space calculator often is provided with another tool such as a space management tool or a database design and modeling tool.

Furthermore, new types of products are being announced quite regularly. For example, one vendor recently announced a product that analyzes all of the SQL that accesses a particular table and uses and set of expert rules to automatically recommend the most efficient indexes. Be sure to keep your eyes open for new and useful types of tools not mentioned in this article.

Synopsisback to top

Third party add-on tools can significantly improve the efficiency of developing applications that access relational data. When evaluating products, look for features important to your organization. Consider adopting checklists for product comparisons based upon the features discussed in this article. And remember, most RDBMS products leave quite a bit to be desired in the administration, data access, performance monitoring, and application development areas. Third party tools can help alleviate these pains.

Share

submit to reddit

About Craig Mullins

Craig S. Mullins is a data management strategist and principal consultant for Mullins Consulting, Inc. He has three decades of experience in the field of database management, including working with DB2 for z/OS since Version 1. Craig is also an IBM Information Champion and is the author of two books: DB2 Developer’s Guide and Database Administration:The Complete Guide to Practices and Procedures. You can contact Craig via his website.

Top