I'm running into a very strange issue that I have found no explanation for yet. With SQL Server 2008, using GROUP BY orders my rows even though no ORDER BY is specified. Here is a script that demonstrates the situation.

CREATE TABLE #Values ( FieldValue varchar(50) )

;WITH FieldValues AS
(
    SELECT '4' FieldValue UNION ALL
    SELECT '3' FieldValue UNION ALL
    SELECT '2' FieldValue UNION ALL
    SELECT '1' FieldValue
)
INSERT INTO #Values ( FieldValue )
SELECT
    FieldValue 
FROM FieldValues

-- First SELECT demonstrating they are ordered DESCENDING
SELECT
    FieldValue
FROM #Values

-- Second SELECT demonstrating they are ordered ASCENDING
SELECT
    FieldValue
FROM #Values
GROUP BY
    FieldValue

DROP TABLE #Values

The first SELECT will return

4
3
2
1

The second SELECT will return

1
2
3
4

Yet the MSDN documentation states: "The GROUP BY clause does not order the result set"

+4  A: 

If you don't specify an ORDER BY clause, SQL Server is free to return the results in any order.

Maybe in your specific query it will return results ordered, but that doesn't mean it will always sort result sets when using a GROUP BY clause.

Maybe (just maybe) it's doing some hash aggregate to compute the GROUP BY and the hash table happens to be sorted by 1,2,3,4. And then it is returning rows in the hash order...

Pablo Santa Cruz
+7  A: 

To answer this question, look at the query plans produced by both.
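
If the graphical plan viewer isn't handy, one way to see the plans as text is SET STATISTICS PROFILE ON, which runs each statement and returns an extra result set describing the operators it used. This is only a sketch built on the question's table; the exact operator names you see depend on your server version and statistics.

```sql
CREATE TABLE #Values ( FieldValue varchar(50) )

INSERT INTO #Values ( FieldValue )
SELECT '4' UNION ALL
SELECT '3' UNION ALL
SELECT '2' UNION ALL
SELECT '1'

-- Each statement below now returns its rows plus a plan result set;
-- the StmtText column lists the operators the statement executed.
SET STATISTICS PROFILE ON

-- Plain scan of the temp table
SELECT FieldValue FROM #Values

-- Same table with GROUP BY; compare the operators in StmtText
SELECT FieldValue FROM #Values GROUP BY FieldValue

SET STATISTICS PROFILE OFF

DROP TABLE #Values
```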

The first SELECT is a simple table scan, which means that it produces rows in allocation order. Since this is a new table, that matches the order in which you inserted the records.

The second SELECT adds a GROUP BY, which SQL Server implements via a distinct sort since the estimated row count is so low. Were you to have more rows or add an aggregate to your SELECT, this operator may change.

For example, try:

CREATE TABLE #Values ( FieldValue varchar(50) )

;WITH FieldValues AS
(
    SELECT '4' FieldValue UNION ALL
    SELECT '3' FieldValue UNION ALL
    SELECT '2' FieldValue UNION ALL
    SELECT '1' FieldValue
)
INSERT INTO #Values ( FieldValue )
SELECT
    A.FieldValue
FROM FieldValues A
CROSS JOIN FieldValues B
CROSS JOIN FieldValues C
CROSS JOIN FieldValues D
CROSS JOIN FieldValues E
CROSS JOIN FieldValues F

SELECT
    FieldValue
FROM #Values
GROUP BY
    FieldValue

DROP TABLE #Values

Due to the number of rows, this changes into a hash aggregate, and now there is no sort in the query plan.
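
You don't need the larger table to compare the two strategies. As a rough sketch, the HASH GROUP and ORDER GROUP query hints ask the optimizer to use a hash-based or a sort/stream-based aggregate respectively, so you can run both against the original four rows and compare the plans. The hints are for experimenting only, and the row order of either result is still not guaranteed.

```sql
CREATE TABLE #Values ( FieldValue varchar(50) )

INSERT INTO #Values ( FieldValue )
SELECT '4' UNION ALL
SELECT '3' UNION ALL
SELECT '2' UNION ALL
SELECT '1'

-- Ask for a hash-based aggregate even though the table is tiny
SELECT FieldValue
FROM #Values
GROUP BY FieldValue
OPTION (HASH GROUP)

-- Ask for a sort/stream-based aggregate for comparison
SELECT FieldValue
FROM #Values
GROUP BY FieldValue
OPTION (ORDER GROUP)

DROP TABLE #Values
```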

With no ORDER BY, SQL Server can return the results in any order, and the order they come back in is a side effect of whichever plan it estimates will return the data fastest.
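
If the order matters, state it. A minimal sketch using the question's table, where the ORDER BY on the outermost SELECT is the only thing that fixes the order of the rows:

```sql
CREATE TABLE #Values ( FieldValue varchar(50) )

INSERT INTO #Values ( FieldValue )
SELECT '4' UNION ALL
SELECT '3' UNION ALL
SELECT '2' UNION ALL
SELECT '1'

-- The explicit ORDER BY decides the row order, whatever plan is chosen
SELECT
    FieldValue
FROM #Values
GROUP BY
    FieldValue
ORDER BY
    FieldValue DESC

DROP TABLE #Values
```

This returns 4, 3, 2, 1 regardless of which aggregate operator is used. Note that FieldValue is a varchar, so the sort is a string comparison; with multi-digit values you would probably want to order by a numeric cast instead.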

Tadmas
Execution plan does indeed show a DISTINCT SORT when grouping.
Nathan Palmer
Great response! Thanks for the extra modification to show the difference.
Nathan Palmer
+1  A: 

This is a side effect of DB optimisation. The SQL engine generally has to process the data in order of the grouped column. Rather than spend time reordering it afterwards, it leaves the result set in whatever order it was processed.

There are situations where this will not be the case (especially if using aggregate functions). Using an ORDER BY clause obviously overrides this behaviour (and consequently adds a tiny bit of extra load, though not enough to worry about in all but the most extreme cases).

Goat Master
A: 

You can clean up your insert statements as well here. SQL Server 2008 added a new feature for inserts. Example:

INSERT INTO tbl ( clmn ) VALUES ( 1 ), ( 2 ), ( 3 )

Each record doesn't need its own INSERT statement; the rows are separated by commas.
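
Applied to the script in the question, that might look like the sketch below. It assumes SQL Server 2008 or later, since earlier versions do not accept a multi-row VALUES list.

```sql
CREATE TABLE #Values ( FieldValue varchar(50) )

-- One INSERT with four rows replaces the CTE of UNION ALL selects
INSERT INTO #Values ( FieldValue )
VALUES ( '4' ), ( '3' ), ( '2' ), ( '1' )

SELECT
    FieldValue
FROM #Values
GROUP BY
    FieldValue

DROP TABLE #Values
```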

Salizar Marxx