views:

63

answers:

3

If I have two columns, one with very high cardinality and one with very low cardinality (unique # of values), does it matter in which order I group by?

Here's an example:

select 
     dimensionName, 
     dimensionCategory, 
     sum(someFact)
from SomeFact f
join SomeDim d on f.dimensionKey = d.dimensionKey
group by 
    d.dimensionName,  -- large number of unique values
    d.dimensionCategory -- small number of unique values

Are there situations where it matters?

+2  A: 

No, the order doesn't matter for the GROUP BY clause.

MySQL and SQLite are the only databases I'm aware of that allow you to omit columns from the group by (non-standard, not portable) but the order doesn't matter there either.

OMG Ponies
+2  A: 

SQL is declarative.

In this case, you have told the optimiser how you want the data grouped and it works out how to do it.

It won't evaluate line by line (procedural) and look at one column first

The main place column order matters is for indexes. col1, col2 is not the same as col2, col1. At all.

gbn
A: 

There's a legacy, non-standard feature of Microsoft SQL Server called ROLLUP. ROLLUP is an extension to the GROUP BY syntax and when it is used the order of the GROUP BY columns determines which columns should be grouped in the result. ROLLUP is deprecated however. The standard SQL alternative is to use grouping sets, which is supported by SQL Server 2008 and later versions.

dportas