views:

397

answers:

5
+3  Q: 

Index a sum column

Is summing a column faster when that column has an index than when it has none?

+5  A: 

No. Indexes improve searches by limiting how many checks are required. An aggregate function (count, max, min, sum, avg) has to run through all the entries in a column regardless.

Spencer Ruport
+10  A: 

Sorry, it is not clear what you are asking.

Are you asking, would it speed up a query such as

SELECT product, sum(quantity) FROM receipts 
GROUP BY product

if you added an index on quantity?

If that is the question, then the answer is no. Generally speaking, indexes are helpful when you need to find just a few rows among many; here you need all rows, so an index does not help.

There is an obscure exception (one that applies so rarely that most DB optimizers probably don't bother implementing the trick). If your query happens to be

SELECT sum(foo) FROM bar

where there is an index on foo and bar is a table with many columns, the database can read the full index, which costs less I/O than reading the underlying table, and compute the answer directly from the index without ever touching the "real" table. This is a fairly rare case, however, and you will want to check that your optimizer actually does this before relying on it.
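One way to check what an optimizer does with this query is to compare its execution plan before and after creating the index. The sketch below uses SQLite through Python's sqlite3 module purely as an illustration; the padding columns and the index name idx_bar_foo are made up for the example, and other databases expose their plans differently (and may decide differently).

```python
import sqlite3

# Hypothetical wide table: `foo` is the summed column, pad1/pad2 only add width.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bar (id INTEGER PRIMARY KEY, foo INTEGER, pad1 TEXT, pad2 TEXT)"
)
conn.executemany(
    "INSERT INTO bar (foo, pad1, pad2) VALUES (?, ?, ?)",
    [(i, "x" * 50, "y" * 50) for i in range(1000)],
)


def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT sum(foo) FROM bar"

plan_before = plan(query)  # no index exists yet
conn.execute("CREATE INDEX idx_bar_foo ON bar (foo)")
plan_after = plan(query)   # the planner may now scan the index instead
total = conn.execute(query).fetchone()[0]

print("before:", plan_before)
print("after: ", plan_after)
print("sum:   ", total)
```

If the second plan mentions the index as a covering index, the sum is being answered from the index alone. The result of the query is the same either way; only the amount of data read changes.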

SquareCog
+1 because it's an interesting use of indexes.
Chris Lively
+1 Good advice to check the execution plan produced by the optimizer.
David B
+1  A: 

If you want to make the summation faster, you can pre-materialize the result. On Oracle, use Materialized Views; on MS SQL, use Indexed Views.
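To show the idea of pre-materializing a sum without depending on Oracle or MS SQL syntax, here is a minimal sketch in SQLite (via Python's sqlite3). SQLite has neither materialized views nor indexed views, so this swaps in a different technique: a summary table kept up to date by a trigger. The table product_totals and the trigger name are hypothetical, and only inserts are handled; updates and deletes would need their own triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE receipts (product TEXT NOT NULL, quantity INTEGER NOT NULL);

    -- Hypothetical summary table holding one precomputed sum per product.
    CREATE TABLE product_totals (product TEXT PRIMARY KEY, total INTEGER NOT NULL);

    -- Keep the summary in step with every insert into receipts.
    CREATE TRIGGER receipts_after_insert AFTER INSERT ON receipts
    BEGIN
        INSERT OR IGNORE INTO product_totals (product, total)
            VALUES (NEW.product, 0);
        UPDATE product_totals
            SET total = total + NEW.quantity
            WHERE product = NEW.product;
    END;
    """
)

conn.executemany(
    "INSERT INTO receipts (product, quantity) VALUES (?, ?)",
    [("apple", 3), ("pear", 2), ("apple", 4)],
)

# Reading the precomputed totals touches one small row per product.
totals = dict(conn.execute("SELECT product, total FROM product_totals"))

# The same figures recomputed from the base table, for comparison.
recomputed = dict(
    conn.execute("SELECT product, sum(quantity) FROM receipts GROUP BY product")
)

print(totals)
print(recomputed)
```

The trade-off is the usual one for pre-materialized results: reads of the sum become cheap, while every write to receipts does a little extra work.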

On your specific question of whether creating an index on a column that is being summed is faster than having no index, the answer is no.

The answer to your question lies in Spencer's answer:

"An aggregate function (count, max, min, sum, avg) has to run through all the entries in the columns being summed regardless."

I only clarified which columns Spencer's answer refers to. His answer is correct nonetheless.

Michael Buen
A: 

If the index is covering, it will generally be faster. How much faster will be determined by the difference between the number of columns in the table versus the number in the index. In addition, it might be faster if there are any filtering criteria.

Cade Roux
A: 

I found that indexing a column used in the WHERE clause (productid here) helps with a query like this:

SELECT productid, sum(quantity) FROM receipts WHERE productid = 1 GROUP BY productid

One of my queries went from 45 seconds to almost instant once I added the index.
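The effect can be reproduced on a small scale. The sketch below, again using SQLite through Python's sqlite3 as an assumed stand-in for whatever database is in use, runs the same query and prints the plan with and without an index on productid. The sample data and the index name idx_receipts_productid are invented for the example, and no timings are claimed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE receipts (productid INTEGER, quantity INTEGER)")
# Made-up data: 10,000 receipts spread evenly over 100 product ids.
conn.executemany(
    "INSERT INTO receipts (productid, quantity) VALUES (?, ?)",
    [(i % 100, 1) for i in range(10000)],
)

query = (
    "SELECT productid, sum(quantity) FROM receipts "
    "WHERE productid = 1 GROUP BY productid"
)


def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_without_index = plan(query)  # every row has to be examined
conn.execute("CREATE INDEX idx_receipts_productid ON receipts (productid)")
plan_with_index = plan(query)     # only rows with productid = 1 are visited
result = conn.execute(query).fetchone()

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
print("result:       ", result)
```

Here the index helps because of the filter, not because of the sum: it narrows the rows that need to be read before any aggregation happens.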

With a single product ID, do you need the product ID in the SELECT list?
Jonathan Leffler