views:

3994

answers:

7

I would like to know if there are general rules for creating an index or not. How do I choose which fields I should include in this index or when not to include them?

I know its always depends on the environement and the amount of data, but I was wondering if we could make some globally accepted rules about making indexes in Oracle.

+4  A: 

There are some things you should always index:

  • Primary Keys - these are given an index automatically (unless you specify a suitable existing index for Oracle to use)
  • Unique Keys - these are given an index automatically (ditto)
  • Foreign Keys - these are not automatically indexed, but you should add one to avoid performance issues when the constraints are checked

After that, look for other columns that are frequently used to filter queries: a typical example is people's surnames.

Tony Andrews
Well that's not -exactly- true. You can create a PK without it generating an index automatically -- by specifying an existing index to use.
Fair point - I have updated my text
Tony Andrews
A: 

Usually one puts the ID columns up front and those usually identify the rows uniquely. A combination of columns can also do the same thing. As an example using cars... tags or license plates are unique and qualify for an index. They (the tags column) can qualify for the primary key. The owners name can qualify for an index if you are going to search on name. make of car really shouldn't get an index in the beginning as it's not going to vary too much. Indexes don't help if the data in the column doesn't vary too much.

Take a look at the SQL - what are the where clauses looking at. Those may need an index.

Measure. What is the issue - pages/queries taking too long ? what's being used for the queries. Create an index on those columns.

Caveats: indexes need time for updates and space.

and sometimes full table scans are quicker than an index. small tables can be scanned quicker than getting the index and then hitting the table. Look at your joins.

jim
A: 

Look into Database Normalization - you'll find a lot of good, industry standard rules about what keys should exist, how databases should be related, and hints on indexes.

Adam Davis
+1  A: 

From the 10g Oracle Database Application Developers Guide - Fundamentals, Chapter 5:

In general, you should create an index on a column in any of the following situations:

  • The column is queried frequently.
  • A referential integrity constraint exists on the column.
  • A UNIQUE key integrity constraint exists on the column.

Use the following guidelines for determining when to create an index:

  • Create an index if you frequently want to retrieve less than about 15% of the rows in a large table. This threshold percentage varies greatly, however, according to the relative speed of a table scan and how clustered the row data is about the index key. The faster the table scan, the lower the percentage; the more clustered the row data, the higher the percentage.
  • Index columns that are used for joins to improve join performance.
  • Primary and unique keys automatically have indexes, but you might want to create an index on a foreign key; see Chapter 6, "Maintaining Data Integrity in Application Development" for more information.
  • Small tables do not require indexes; if a query is taking too long, then the table might have grown from small to large.

Some columns are strong candidates for indexing. Columns with one or more of the following characteristics are good candidates for indexing:

  • Values are unique in the column, or there are few duplicates.
  • There is a wide range of values (good for regular indexes).
  • There is a small range of values (good for bitmap indexes).
  • The column contains many nulls, but queries often select all rows having a value. In this case, a comparison that matches all the non-null values, such as:

    WHERE COL_X >= -9.99 *power(10,125) is preferable to WHERE COL_X IS NOT NULL

    This is because the first uses an index on COL_X (assuming that COL_X is a numeric column).

Columns with the following characteristics are less suitable for indexing:

  • There are many nulls in the column and you do not search on the non-null values.
DCookie
It's dangerous to say> There is a small range of values (good for bitmap indexes).This will screw up an OLTP database every single time.
A: 

Wow, that's just such a huge topic, it's hard to answer in this format. I srtongly recommend this book.

Relational Database Index Design and the Optimizers by Tapio Lahdenmaki

You don't just use indexes to make table access faster, sometimes you make indexes to avoid table access altogether. Something not mentioned yet but vital.

There's a whole science to this if you really want to make your database perform maximally.

Ah, one specific optimization to Oracle is building reverse key indexes. If you have a PK index of a monoatomically increasing value, like a sequence, and you have highly concurrent inserts and don't plan to range scan that column then make it a reverse key index.

See how specific these optimizations can be?

+7  A: 

The Oracle documentation has an excellent set of considerations for indexing choices: http://download.oracle.com/docs/cd/B28359_01/server.111/b28274/data_acc.htm#PFGRF004

Quoting:

  • Consider indexing keys that are used frequently in WHERE clauses.

  • Consider indexing keys that are used frequently to join tables in SQL statements. For more information on optimizing joins, see the section "Using Hash Clusters for Performance".

  • Choose index keys that have high selectivity. The selectivity of an index is the percentage of rows in a table having the same value for the indexed key. An index's selectivity is optimal if few rows have the same value. Note: Oracle automatically creates indexes, or uses existing indexes, on the keys and expressions of unique and primary keys that you define with integrity constraints. Indexing low selectivity columns can be helpful if the data distribution is skewed so that one or two values occur much less often than other values.

  • Do not use standard B-tree indexes on keys or expressions with few distinct values. Such keys or expressions usually have poor selectivity and therefore do not optimize performance unless the frequently selected key values appear less frequently than the other key values. You can use bitmap indexes effectively in such cases, unless the index is modified frequently, as in a high concurrency OLTP application.

  • Do not index columns that are modified frequently. UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes as well as data in tables. They also generate additional undo and redo.

  • Do not index keys that appear only in WHERE clauses with functions or operators. A WHERE clause that uses a function, other than MIN or MAX, or an operator with an indexed key does not make available the access path that uses the index except with function-based indexes.

  • Consider indexing foreign keys of referential integrity constraints in cases in which a large number of concurrent INSERT, UPDATE, and DELETE statements access the parent and child tables. Such an index allows UPDATEs and DELETEs on the parent table without share locking the child table.

  • When choosing to index a key, consider whether the performance gain for queries is worth the performance loss for INSERTs, UPDATEs, and DELETEs and the use of the space required to store the index. You might want to experiment by comparing the processing times of the SQL statements with and without indexes. You can measure processing time with the SQL trace facility.

David Aldridge
A: 

For a starter:

http://www.youtube.com/watch?v=7ZgG7ttJTSc