
I need to write a query that will group a large number of records by periods of time from Year to Hour.

My initial approach has been to decide the periods procedurally in C#, iterate through each and run the SQL to get the data for that period, building up the dataset as I go.

SELECT Sum(someValues)
FROM table1
WHERE deliveryDate BETWEEN @fromDate AND @toDate

I've subsequently discovered I can group the records using Year(), Month(), and Day(), or datepart(week, date) and datepart(hh, date).

SELECT Sum(someValues)
FROM table1
GROUP BY Year(deliveryDate), Month(deliveryDate), Day(deliveryDate)

My concern is that using datepart in a GROUP BY will perform worse than running the query multiple times over set periods, because the index on the datetime field can't be used as efficiently. Any thoughts on whether this is true?

Thanks.

+2  A: 

As with anything performance related: measure.

Checking the query plan for the second approach will flag any obvious problems in advance (e.g. a full table scan when you know one is not needed), but there is no substitute for measuring. In SQL performance testing, that measurement should be done with appropriately sized test data.

Since this is a complex case, you are not simply comparing two different ways to do a single query but comparing a single-query approach against an iterative one, so aspects of your environment may play a major role in the actual performance.

Specifically

  1. The 'distance' between your application and the database, since the latency of each call is wasted time compared to the one-big-query approach
  2. Whether you are using prepared statements (if not, each query causes additional parsing effort for the database engine)
  3. Whether the construction of the range queries itself is costly (heavily influenced by 2)
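To make the comparison concrete, here is a minimal sketch of the two approaches using an in-memory SQLite database from Python. The table and column names (table1, deliveryDate, someValues) come from the question; the data, periods, and use of strftime instead of SQL Server's DATEPART are my own stand-ins:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (deliveryDate TEXT, someValues REAL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("2009-01-15 08:00", 10.0), ("2009-01-20 09:30", 5.0),
     ("2009-02-01 10:00", 7.5), ("2009-02-28 23:00", 2.5)],
)

# Approach 1: one round trip per period, ranges decided in client code.
periods = [("2009-01-01", "2009-02-01"), ("2009-02-01", "2009-03-01")]
iterative = [
    conn.execute(
        "SELECT SUM(someValues) FROM table1 "
        "WHERE deliveryDate >= ? AND deliveryDate < ?",
        (lo, hi),
    ).fetchone()[0]
    for lo, hi in periods
]

# Approach 2: a single query, letting the database do the grouping.
grouped = [
    row[1]
    for row in conn.execute(
        "SELECT strftime('%Y-%m', deliveryDate) AS ym, SUM(someValues) "
        "FROM table1 GROUP BY ym ORDER BY ym"
    )
]

# Both produce the same per-month totals; the second needs one round trip.
assert iterative == grouped
```

The functional result is identical; the difference to measure is the cost of N round trips plus N query constructions versus one larger result set.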
ShuggyCoUk
A: 

I think you should benchmark it to get reliable results, but IMHO my first thought is that letting the DB take care of it (your second approach) will be much faster than doing it in your client code. With your first approach you have multiple round trips to the DB, which I think will be far more expensive. :)

Frederik Gheysels
+3  A: 

If you put a formula into the field part of a comparison, you get a table scan.

The index is on the field, not on datepart(field), so the value has to be calculated for ALL rows. I think your hunch is right.

Galwegian
There's no WHERE clause, so you're going to get a table scan anyway since it's going to look at every row.
Joe
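The point about formulas on the indexed field can be seen directly in a query plan. A small sketch using SQLite's EXPLAIN QUERY PLAN (the same principle applies to SQL Server's plans; strftime stands in for DATEPART, and the index name is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (deliveryDate TEXT, someValues REAL)")
conn.execute("CREATE INDEX ix_delivery ON table1 (deliveryDate)")

def plan(sql):
    # Concatenate the 'detail' column of the plan rows.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Sargable: a bare column comparison, so the index can be used.
p1 = plan("SELECT SUM(someValues) FROM table1 "
          "WHERE deliveryDate BETWEEN '2009-01-01' AND '2009-02-01'")

# Non-sargable: wrapping the column in a function hides it from the index.
p2 = plan("SELECT SUM(someValues) FROM table1 "
          "WHERE strftime('%Y', deliveryDate) = '2009'")

print(p1)  # index range search
print(p2)  # full table scan
```

The first plan searches via the index; the second has to scan and evaluate the function for every row, which is exactly the concern in the question.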
+1  A: 

you could do something similar to this:

SELECT Sum(someValues)
FROM 
(
    SELECT someValues, Year(deliveryDate) as Y, Month(deliveryDate) as M, Day(deliveryDate) as D
    FROM table1
    WHERE deliveryDate BETWEEN @fromDate AND @toDate
) t
GROUP BY Y, M, D
Mladen Prajdic
+2  A: 

If you can tolerate the performance hit of joining in yet one more table, I have a suggestion that seems odd but works really well.

Create a table, which I'll call ALMANAC, with columns like weekday, month, and year. You can even add columns for company-specific features of a date, such as whether it is a company holiday. You might also want to add starting and ending timestamp columns, as referenced below.

Although you might get by with one row per day, when I did this I found it convenient to use one row per shift, with three shifts in a day. Even at that rate, a period of ten years was only a little over 10,000 rows.

When you write the SQL to populate this table, you can use all the date-oriented built-in functions to make the job easier. When you query, you can use the date column as a join condition, or you may need two timestamps to provide a range for catching timestamps that fall within it. The rest is as easy as working with any other kind of data.
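A minimal sketch of the idea, using Python and SQLite with one row per day rather than per shift. The schema and names (almanac, is_holiday) are illustrative, not from a real system, and the join assumes date-only deliveryDate values; with full timestamps you would join on a start/end range instead:

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE almanac (
    day TEXT PRIMARY KEY,
    year INTEGER, month INTEGER, weekday INTEGER,
    is_holiday INTEGER)""")

# Populate one row per day; date parts are precomputed once, here.
d, end = date(2009, 1, 1), date(2009, 3, 1)
while d < end:
    conn.execute("INSERT INTO almanac VALUES (?,?,?,?,0)",
                 (d.isoformat(), d.year, d.month, d.isoweekday()))
    d += timedelta(days=1)

conn.execute("CREATE TABLE table1 (deliveryDate TEXT, someValues REAL)")
conn.executemany("INSERT INTO table1 VALUES (?,?)",
                 [("2009-01-15", 10.0), ("2009-02-01", 7.5)])

# Grouping becomes a plain join; no date functions in the report query.
rows = conn.execute("""
    SELECT a.year, a.month, SUM(t.someValues)
    FROM table1 t JOIN almanac a ON a.day = t.deliveryDate
    GROUP BY a.year, a.month ORDER BY a.month""").fetchall()
```

Because the date parts live in indexed columns of the almanac table, the report query never applies a function to the fact table's datetime field.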

Walter Mitty
A: 

You may want to look at a dimensional approach (this is similar to what Walter Mitty has suggested), where each row has a foreign key to a date and/or time dimension. This allows very flexible summation through a join to these tables, where the date parts are precalculated. In these cases, the key is usually a natural integer key of the form YYYYMMDD or HHMMSS, which is relatively performant and also human-readable.
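To illustrate the natural keys mentioned above, here is a small sketch of how a fact row's timestamp would be split into the two integer keys. The function names are my own; the key formats are as described:

```python
from datetime import datetime

def date_key(dt: datetime) -> int:
    # YYYYMMDD integer key into the date dimension.
    return dt.year * 10000 + dt.month * 100 + dt.day

def time_key(dt: datetime) -> int:
    # HHMMSS integer key into the time dimension.
    return dt.hour * 10000 + dt.minute * 100 + dt.second

dt = datetime(2009, 3, 14, 15, 9, 26)
assert date_key(dt) == 20090314
assert time_key(dt) == 150926
```

The fact table stores these two integers, and any summation by year, month, week, hour, etc. is a join to the dimension table plus a GROUP BY on its precalculated columns.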

Another alternative might be indexed views, where there are separate expressions for each of the date parts.

Or calculated columns.

But performance has to be tested and execution plans examined...

Cade Roux
A: 

I was looking for a similar solution for reporting purposes and came across an article called Group by Month (and other time periods). It shows various ways, good and bad, to group by a datetime field. Definitely worth a look.

alextansc