I have tables of data samples, each row consisting of a timestamp and some data. Each table has a clustered index on the timestamp followed by a data-specific key. The samples are not necessarily equally spaced in time.
I need to downsample the data in a particular time range in order to draw graphs - say, going from 100,000 rows to N, where N is about 50. While I may have to compromise on the "correctness" of the algorithm from a DSP point of view, I'd like to keep this in SQL for performance reasons.
My current idea is to group the samples in the time range into N boxes, and then take the average of each box. One way to achieve this in SQL is to apply a partition function that maps each date to a value from 0 to N-1 (inclusive), and then GROUP BY that value and take the AVG.
I think this GROUP BY can be performed without a sort, because the dates come from a clustered index and the partition function is monotone. However, SQL Server doesn't seem to notice this, and it issues a sort that accounts for 78% of the execution cost (in the example below). Assuming I'm right and the sort really is unnecessary, eliminating it would make the query roughly five times faster.
Is there any way to force SQL Server to skip the sort? Or is there a better way to approach the problem?
Cheers, Ben
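-- demo setup: a small test table with a clustered primary key on (date, v)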
IF EXISTS(SELECT name FROM sysobjects WHERE name = N'test') DROP TABLE test
CREATE TABLE test
(
    date DATETIME NOT NULL,
    v FLOAT NOT NULL,
    CONSTRAINT PK_test PRIMARY KEY CLUSTERED (date ASC, v ASC)
)
INSERT INTO test (date, v) VALUES ('2009-08-22 14:06:00.000', 1)
INSERT INTO test (date, v) VALUES ('2009-08-22 17:09:00.000', 8)
INSERT INTO test (date, v) VALUES ('2009-08-24 00:00:00.000', 2)
INSERT INTO test (date, v) VALUES ('2009-08-24 03:00:00.000', 9)
INSERT INTO test (date, v) VALUES ('2009-08-24 14:06:00.000', 7)
-- the lower bound is set to the table min for demo purposes; in reality
-- it could be any date
declare @min float
set @min = cast((select min(date) from test) as float)
-- similarly for max
declare @max float
set @max = cast((select max(date) from test) as float)
-- the number of results to return (assuming enough data is available)
declare @count int
set @count = 3
-- precompute scale factor
declare @scale float
set @scale = (@count - 1) / (@max - @min)
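-- sanity-check the computed scale factor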
select @scale
-- this scales the dates from 0 to n-1
select (cast(date as float) - @min) * @scale, v from test
-- this rounds the scaled dates to the nearest partition,
-- groups by the partition, and then averages values in each partition
select round((cast(date as float) - @min) * @scale, 0), avg(v) from test
group by round((cast(date as float) - @min) * @scale, 0)
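With the sample data above and @count = 3, the scaled dates come out to 0, ~0.13, ~1.41, ~1.54 and 2, so the rounded partitions are 0, 0, 1, 2, 2 and the final query should return (0, 4.5), (1, 2) and (2, 8).

For reference, here is an equivalent formulation that computes the partition once in a derived table so the ROUND expression isn't repeated; I haven't checked whether it changes the plan (the derived-table alias and column name are just illustrative):

-- same grouping, but the partition expression is computed once in a derived table
select bucket, avg(v)
from
(
    select round((cast(date as float) - @min) * @scale, 0) as bucket, v
    from test
) t
group by bucket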