views:

254

answers:

3

Do you need to create an index for fields of group by fields in an Oracle database?

For example:

select * 
from some_table
where field_one is not null and field_two = ?
group by field_three, field_four, field_five

I was testing the indexes I created for the above and the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?

+3  A: 

It could be correct, but that would depend on how much data you have. Typically I would create an index for the columns I was using in a GROUP BY, but in your case the optimizer may have decided that after using the field_two index that there wouldn't be enough data returned to justify using the other index for the GROUP BY.

Eric Petroelje
+1 for mentioning the optimizer, it's probably the most likely cause of this.
Ben S
Thank you for the response. I didn't realize the explain plan is dependent on the amount of data in the table. There is currently no data in the table, which explains why the optimizer may have skipped the other indexes. On another note, would a composite index on just field_three and field_four without field_five still be used in query like the one above? This would not include all the fields in the group by clause.
jigjig
@Mark - that explains it. See my edits for more info about what indexes Oracle could use for the group by.
Eric Petroelje
@Eric - Great thanks man. That is very helpful.
jigjig
@jva - Not sure what you mean about grouping order being relevant to the optimizer. It's only relevant in terms of what indexes can be used. Basicly the order of the columns in the index should be the same as the order of the columns in your group by clause.
Eric Petroelje
@Eric: It seems unlikely to me that the GROUP BY order matters to Oracle. Changing the order of columns in the GROUP BY does not change the semantics of the query and so should not change the plan. Proof?
WW
Eric Petroelje
+1  A: 

No, this can be incorrect.

If you have a large table, Oracle can prefer deriving the fields from the indexes rather than from the table, even there is no single index that covers all values.

In the latest article in my blog:

, there is a query in which Oracle does not use full table scan but rather joins two indexes to get the column values:

SELECT  l.id, l.value
FROM    t_left l
WHERE   NOT EXISTS
        (
        SELECT  value
        FROM    t_right r
        WHERE   r.value = l.value
        )

The plan is:

SELECT STATEMENT
 HASH JOIN ANTI
  VIEW , 20090917_anti.index$_join$_001
   HASH JOIN
    INDEX FAST FULL SCAN, 20090917_anti.PK_LEFT_ID
    INDEX FAST FULL SCAN, 20090917_anti.IX_LEFT_VALUE
  INDEX FAST FULL SCAN, 20090917_anti.IX_RIGHT_VALUE

As you can see, there is no TABLE SCAN on t_left here.

Instead, Oracle takes the indexes on id and value, joins them on rowid and gets the (id, value) pairs from the join result.

Now, to your query:

SELECT  *
FROM    some_table
WHERE   field_one is not null and field_two = ?
GROUP BY
        field_three, field_four, field_five

First, it will not compile, since you are selecting * from a table with a GROUP BY clause.

You need to replace * with expressions based on the grouping columns and aggregates of the non-grouping columns.

You will most probably benefit from the following index:

CREATE INDEX ix_sometable_23451 ON some_table (field_two, field_three, field_four, field_five, field_one)

, since it will contain everything for both filtering on field_two, sorting on field_three, field_four, field_five (useful for GROUP BY) and making sure that field_one is NOT NULL.

Quassnoi
Very interesting - I don't think I've ever seen that before (where Oracle would join two indexes and avoid the table entirely)
Eric Petroelje
`@Eric Petroelje`: there is a special hint, `INDEX_JOIN`, to force this method.
Quassnoi
A: 

Do you need to create an index for fields of group by fields in an Oracle database?

No. You don't need to, in the sense that a query will run irrespective of whether any indexes exist or not. Indexes are provided to improve query performance.

It can, however, help; but I'd hesitate to add an index just to help one query, without thinking about the possible impact of the new index on the database.

...the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?

Not always. Often a GROUP BY will require Oracle to perform a sort (but not always); and you can eliminate the sort operation by providing a suitable index on the column(s) to be sorted.

Whether you actually need to worry about the GROUP BY performance, however, is an important question for you to think about.

Jeffrey Kemp