Databases need to do the job better – without having to upgrade.

Published in TDAN.com April 2003

Every company wants more efficiency from databases and servers. Databases need to do the job better – without having to upgrade.

For a database administrator, this represents quite a challenge. How might you bring about reductions in memory and disk usage, while proliferating indexes for better query performance with minimal loading performance impact? Not to mention being able to implement efficient composite key indexes, and tune indexes to meet the workload?

The fact is, the opportunity is there to achieve efficiency improvements in databases, promoting faster application performance overall, as well as enhancing the ability to deploy more applications while using fewer disks for database applications and queries. All while avoiding the need to upgrade.

It’s an opportunity that can remove constraints on business processes and avoid the option of having to make less data available … an option that would be likely to lead to poorer management decisions based on poorer data. So how can this opportunity be exploited?

Data Indexing

Consider how data indexing affects the performance of your databases. When was the last time you heard about something new in this area? For decades, it has hardly been talked about. But the word in the industry today is: neglect this area at your peril!

A quiet revolution is taking place in indexing technology – even a quantum leap – that offers enormous potential for database administrators to make a significant impact on business performance.

This article looks briefly at existing data indexing technologies and a new entrant to the market. The main example used in this article is an Oracle relational database, though the principles hold true across all database types.

The improvements in database performance come from new indexing technology called “Adaptive Addressing”. It provides an indexing solution that can improve existing Oracle relational database performance by up to 10 times or, alternatively, by giving developers a set of Application Programming Interfaces (APIs), conventional indexes can be outperformed by up to one hundred (100) times.

So, why have the approaches previously available left a marked gap in the data
indexing arena? Let’s review the steps along the road to where we are today.

Inside Indexes

Indexes are used inside databases to enable fast query access to data stored in tables in random order. Without indexes a query is forced to scan an entire table or partition to find the required data. While indexes can speed up data retrieval, they incur a significant overhead during table inserts, updates and deletes; a compromise must often be struck between data retrieval and data loading performance.

New indexing strategies have been introduced into Oracle: hashed clusters and, more recently, bit-map indexes. These index types have fundamentally different structures and so can exhibit radically different performance.

We all know that a database will only go as fast as its slowest component. With Oracle, it’s the slow disks. A key strategy in optimising Oracle performance is to minimise the number of disk operations within the constraints of the available memory resources, but even large disk operations are not the complete solution and can result in a poor response time in an OLTP environment.

Other considerations are the number of disk operations, taking into account tables residing in cache or on a disk on a busy system.

Disk cost must be considered too, and is incurred when data is inserted into a table. This cost is usually marginal on a heap-organised table; but adding indexes will add significantly to this cost.

Yet another enemy of good performance is poor concurrency, that is, contention among users locking each other out from shared areas of the database.

The overall benefit of an index depends not only on how it reduces the cost of queries, but also how it adversely affects table updates. Different index types vary in their query benefits and performance impacts.

B-Tree Index

B-Tree is the index you get by default when you create an index on a table. It is the default choice because it gives query flexibility and is reasonably insensitive to different key distributions. But this comes with a price – it is an expensive index to maintain for inserts, updates and deletes. It can incur 2 disk operations (or more) for a single table row insert.

Hashed Cluster

Hashed clusters were introduced by Oracle many years ago; yet how many times have you used them? The downside to hashed indexing is that you cannot do range queries, but there’s more…

The index itself is simply an algorithm for converting a key value into a pseudo-random table address. The index occupies no memory and requires no disk space; but its existence has a dramatic impact on the underlying table. The table is no longer heap organised, and consecutively inserted rows now occupy random locations in the table, rather than adjacent positions. This means that, while the index itself incurs no overhead (because it occupies no memory and no disk), an insert into a large table now requires 2 disk I/Os to update the table; namely, one read I/O to fetch the target table block and one write I/O updated table block back on disk. In other words, the maintenance costs are no better than a B-Tree.

Hashed indexing is a highly effective technique for minimizing the cost of queries against static reference data – provided that you don’t want to do any range queries.

Bit Map

The bit map index has only recently been introduced by Oracle. Generally speaking, it is not a good idea to use bit map indexes in an OLTP environment, as this can lead to performance problems in an operational system with multiple users or processes sharing the index.

Variations

Oracle has introduced a number of variations for the B-Tree to alleviate some of the performance issues experienced.

Firstly, compressed keys. They save disk space and particularly help with larger and lower cardinality keys. It allows many more index entries to be squeezed into an index block, with obvious advantages.

Secondly, reverse keys help with B-Tree balancing and organisation issues. They eliminate hot spots in the structure that may arise from key clustering. Reverse keys prevent range searching, but give a more flexible alternative to the hashed cluster and can give better maintenance performance than hashing on small to medium tables.

However, neither of the schemes above resolves the fundamental performance issue inherent in a large B-Tree structure with its legion of leaf blocks.

Finally, with Index Organised Tables (IOT), the entire row data is stored in the index itself. This gives better query performance because we can avoid extra disk I/Os to fetch the row data. However, because it stores the row data less compactly than an equivalent heap organised table – it adversely affects full table scans. Moreover, index maintenance performance can be poor. A large row size reduces the efficiency of the structure, which means that performance bottoms out more rapidly.

Does this sound familiar? An IOT table exhibits similar performance to a hash cluster organised table. Or put it another way, a hashed cluster is effectively an alternative Index Organised Table.

The Future

Each of the indexes discussed aims to improve the performance of selective queries, but can incur a severe penalty on table inserts, updates and deletes; and can introduce contention among users and processes. These are significant issues in an operational system where indexes are constantly changing.

The best approach is where an index structure offers B-Tree like flexibility without the associated maintenance cost; allowing you to proliferate indexes to meet query demands and avoid undo impact elsewhere.

Alternative data index structures do exist and some are commercially available for an Oracle environment. Adaptive Addressing technology is one such solution, suited to the majority of data stored in commercial systems, of a general-purpose scalar data nature. If a system suffers poor performance because of disk bound indexes or index contention, then these alternatives are worth investigating. It could save a costly application re-engineering exercise.

Developers are already finding that if they replace B-Tree indexes with this technology they can deliver memory savings of up to 90 percent. The impact of Adaptive Addressing intelligent indexing in an Oracle environment can mean an improvement of existing Oracle relational database performance by up to ten times. This is achieved by directly replacing existing conventional indexes. However, it is a set of APIs that enable application developers to easily create new solutions that the greatest benefits can come. This technology can be easily integrated into new applications within any database out-performing existing conventional indexes in increases in performance of up to 100 times.

And in terms of implementation for data administrators, data indexing technology can be added with no ongoing work of any description!

Share

submit to reddit

About Duncan Pauly

Duncan is a founder of CopperEye and the inventor of Adaptive Addressing (tm) technology.  Before founding CopperEye Duncan worked as a leading strategy and product consultant for Convergys.  He has widespread experience in product architecture, design and strategy for some of the leading technology and telecommunications companies, including Vodafone, Oracle and Hutchison Telecom. 

In 1992 he founded Xi Systems Ltd, delivering customer care and billing systems to worldwide telecommunications providers.  The company grew to 50 employees and was sold to Convergys in 1995. Duncan has over 20 years of technical experience.  He graduated from Southampton University, UK, with a First in Mathematics.

For further information contact:
CopperEye: 44 (0)1225 745408
email:  via  fiona.moon@coppereye.com
www.coppereye.com

Top