views: 174
answers: 3

I know just enough about SQL tuning to get myself in trouble. Today I was running EXPLAIN PLAN on a query and noticed it was not using indexes when I thought it probably should. So I kept running EXPLAIN on simpler and simpler (and, in my mind, more indexable) queries, until I did EXPLAIN on

select count(*) from table_name

I thought for sure this would return instantly and that the plan would show use of an index, since we have many indexes on this table, including an index on the row_id column, which is unique. Yet the explain plan showed a FULL table scan, and the query took several seconds to complete. (We have 3 million rows in this table.)

Why would Oracle do a full table scan to count the rows in this table? I would like to think that since Oracle is already indexing unique fields, and has to track every insert and update on the table, it would cache the row count somewhere. Even if it doesn't, wouldn't it be faster to scan the entire index than to scan the entire table?

I have two theories. Theory one is that I am imagining how indexes work incorrectly. Theory two is that some setting or parameter somewhere in our Oracle setup is interfering with Oracle's ability to optimize queries (we are on Oracle 9i). Can anyone enlighten me?
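For anyone wanting to reproduce the experiment above, a plan can be generated and displayed like this (table name is a placeholder; DBMS_XPLAN is available from Oracle 9i Release 2 onwards):

```sql
-- Generate the execution plan for the count query
EXPLAIN PLAN FOR
SELECT COUNT(*) FROM table_name;

-- Display the plan just generated
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```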

+5  A: 

Oracle does not cache COUNT(*).

MySQL with MyISAM does (it can afford to), because MyISAM is transactionless and the same COUNT(*) is visible to everyone.

Oracle is transactional, and a row deleted in another transaction may still be visible to your transaction.

So Oracle has to scan the row, see that it's deleted, visit the UNDO, make sure the row is still in place from your transaction's point of view, and add it to the count.

Indexing a UNIQUE value differs from indexing a non-UNIQUE one only logically.

In fact, you can create a UNIQUE constraint over a column with a non-unique index defined, and the index will be used to enforce the constraint.
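A sketch of that point, with placeholder table and index names:

```sql
-- A non-unique index on the column...
CREATE INDEX ix_t_row_id ON t (row_id);

-- ...can still be used to enforce a UNIQUE constraint:
ALTER TABLE t ADD CONSTRAINT uq_t_row_id UNIQUE (row_id)
  USING INDEX ix_t_row_id;
```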

If a column is marked NOT NULL, then an INDEX FAST FULL SCAN over that column's index can be used for the COUNT.

It's a special access method, used for cases when the index order is not important. It does not traverse the B-Tree, but instead just reads the pages sequentially.

Since an index typically has fewer pages than the table itself, the COUNT can be faster with an INDEX_FFS than with a FULL table scan.
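Putting the two conditions together (names are placeholders, and the resulting plan depends on the optimizer's costing):

```sql
-- Declaring the column NOT NULL tells the optimizer that every row
-- is represented in the index, making an INDEX FAST FULL SCAN possible:
ALTER TABLE t MODIFY (row_id NOT NULL);

SELECT COUNT(*) FROM t;  -- the plan should now show INDEX FAST FULL SCAN
```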

Quassnoi
It was the non-nullness that was the problem. With the column set to NOT NULL, the fast full scan is now happening.
Peter Recore
@Peter: when a query concerns only one column (in the `SELECT`, `WHERE`, and `ORDER BY` clauses), a `FULL` scan instead of an `INDEX_FFS` should ring a bell.
Quassnoi
A: 

Expanding a little on the "transactions" reason: when a database supports transactions, at any point in time there may be records in different states, even a "deleted" state. If a transaction fails, those states are rolled back.

A full table scan is done so that the current "version" of each record, as of that point in time, can be accessed.

MySQL's MyISAM doesn't have this problem, since it uses table locking instead of the record locking required for transactions, and it caches the record count, so the count is always returned instantly. InnoDB under MySQL works the same way as Oracle, but returns an "estimate".

You may be able to get a quicker query by counting the distinct values of the primary key; then only the index would be accessed.
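For what it's worth, if row_id really is the primary key (unique and NOT NULL), all of the following should return the same number; whether the optimizer satisfies them from the index depends on its costing:

```sql
-- Equivalent counts when row_id is the primary key (NOT NULL, unique)
SELECT COUNT(*)               FROM t;
SELECT COUNT(row_id)          FROM t;
SELECT COUNT(DISTINCT row_id) FROM t;
```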

+2  A: 

It is certainly possible for Oracle to satisfy such a query with an index (specifically with an INDEX FAST FULL SCAN).

In order for the optimizer to choose that path, at least two things have to be true:

  1. Oracle has to be certain that every row in the table is represented in the index -- basically, that there are no NULL entries that would be missing from the index. If you have a primary key, this should be guaranteed.
  2. Oracle has to calculate the cost of the index scan as lower than the cost of a table scan. I don't think it is necessarily true to assume that an index scan is always cheaper.

Possibly, gathering statistics on the table would change the behavior.
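One way to refresh the statistics is via DBMS_STATS (schema and table names below are placeholders):

```sql
-- Gather fresh optimizer statistics for the table and its indexes
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'APP_SCHEMA',
    tabname => 'TABLE_NAME',
    cascade => TRUE);  -- also gather statistics on the table's indexes
END;
/
```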

Dave Costa
The stats are from last week, and the table has been at 3 million rows for a long time now, so that's one thing I think I can eliminate. I am going to experiment with marking the row_id column NOT NULL - I just checked, and despite row_id looking like and being used as a primary key, the vendor did not specify NOT NULL on it!
Peter Recore
After altering the supposed primary key column to NOT NULL, Oracle does a fast full scan!
Peter Recore