tags:

views:

114

answers:

3

I have a table with millions of rows, and I need to do LOTS of queries which look something like:

select max(date_field) 
where varchar_field1 = 'something' 
group by varchar_field2;

My questions are:

  • Is there a way to create an index to help with this query?
  • What (other) options do I have to enhance performance of this query?
A: 

Obviously, an index on varchar_field1 will help a lot.

Dmitry
depends on the selectivity of the varchar_field1 whether it would help at all.
Shannon Severance
yes, agree.....
Dmitry
+10  A: 

An index on (varchar_field1, varchar_field2, date_field) would be of most use. The database can use the first index field for the where clause, the second for the group by, and the third to calculate the maximum date. It can complete the entire query just using that index, without looking up rows in the table.

Andomar
+1 FWIW, this is called a *covering index*.
Bill Karwin
And you'd need at least one of those columns to be constrained to "not null", or you could make it a function-based index on (varchar_field1, varchar_field2, date_field,0)
David Aldridge
A: 

You can create yourself an extra table with the columns

  varchar_field1  (unique index)
  max_date_field

You can set up triggers on inserts, updates, and deletes on the table you're searching that will maintain this little table -- whenever a row is added or changed, set a row in this table.

We've had good success with performance improvement using this refactoring technique. In our case it was made simpler because we never delete rows from the table until they're so old that nobody ever looks up the max field. This is an especially helpful technique if you can add max_date_field to some other table rather than create a new one.

Ollie Jones