Are indexes good or bad for a large database?

A:

It is good practice to put the index on each column which you used in a WHERE clause.

Salil 2010-04-26 07:50:56

+6 A:

The article says that when dealing with very large data sets, where the amount of rows you need to work with are approaching the number of rows that is in the table, using an index might hurt performance.

In this case, going through the index will indeed hurt performance, as long as you need more data than is present in the index.

To go through the index, the database engine first has to read large parts of the index table (it is a type of table), then for each row (or set of rows) from this result, go to the real table and start cherrypicking pages to read.

If, on the other hand, you only need to retrieve columns that area already part of the index table, then the database engine only has to read from that, and not continue on to the full table for more data.

If you end up reading most or close to most of the actual table in question, all the work required to deal with the index might be more overhead than just doing a full table-scan to begin with.

Now, this is all the article is saying. For most work dealing with a database, using indexes is the exact right thing to do.

For instance, if you need to extract a small set of rows, going through an index instead of a full table scan will be many order of magnitudes faster.

In any case, if you're in doubt, you should do some performance profiling to find out how your application behaves under different types of loads, and then start tweaking, don't take a single article as a silver bullet for anything.

For instance, one way to speed up the example queries that does a count on the pad column in the article, would be to create a single index that covered both val and pad, in this way, the count would simply be a index-scan, and not a index-scan + table-lookup, and would run faster than the full table-scan.

Your best option is to know your data, and to experiment, and to know how the tools you use work, so indeed, learn more about indexes, but in the end, it is you who decides what is best for your program.

Lasse V. Karlsen 2010-04-26 07:52:05

As the article says, "Even if you look at 1% or rows or less full table scan may be faster." So not to be lame, but the answer seems to be "it depends on the specific situation."

Yar 2010-04-26 07:58:36

Yes, indeed it does.

Lasse V. Karlsen 2010-04-26 08:00:44

+2 A:

As always, it depends. I've so far never ran into a scenario as described in that blog posts. Using indexes on my queries for large (50+ million rows) has been on the order of 100 to 10000 times faster than doing a full table scan on these big tables.

There's probably no silver bullet here, you have to test for your particular data and your particular queries.

nos 2010-04-26 07:52:58

ansaurus

tags:

views:

answers:

Are indexes good or bad for a large database?

related questions