My boss is asking me to code a report that has the following components:

  • A pie chart of employee count by state
  • A pie chart of employee count by age bracket (10 year brackets)
  • A pie chart of employee length of service (5 year brackets)
  • A pie chart of employee Male/Female breakdown
  • A pie chart of employee count by salary band (computer-generated brackets).

There may be others.

I know I can do this by writing 5 different SQL statements. However, it seems like this would generate 5 table scans for one report.
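
For example, the age-bracket chart alone would need something like this (table and column names are made up):

-- Hypothetical: one of the 5 statements, counting employees per
-- 10-year age bracket (assumes a BirthDate column on an Employee table;
-- DATEDIFF(YEAR, ...) counts year boundaries, so ages are approximate)
SELECT (DATEDIFF(YEAR, BirthDate, GETDATE()) / 10) * 10 AS AgeBracket,
       COUNT(*) AS cnt
FROM Employee
GROUP BY (DATEDIFF(YEAR, BirthDate, GETDATE()) / 10) * 10;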

I could switch gears and do one table scan, analyse each record on the front end, increment counters, and probably accomplish this in one pass.

Which way would the collective wisdom at stackoverflow go on this?

Is there a way to accomplish this with the CUBE or ROLLUP clauses in T-SQL?
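
Something like this is what I have in mind (a rough sketch; table and column names are made up):

-- Hypothetical: one scan, several summaries via CUBE
-- (assumes an Employee table with State and Gender columns)
SELECT State, Gender, COUNT(*) AS cnt
FROM Employee
GROUP BY State, Gender WITH CUBE;
-- the subtotal rows with Gender rolled up (GROUPING(Gender) = 1) would
-- feed the by-State pie; those with State rolled up, the by-Gender pie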

+2  A: 

If you want 5 pie charts and need to summarize, then you need 5 SQL statements, since your GROUP BY clause is different for each.

SQLMenace
Not true. I could pull all the relevant records over to the front end, and then build the statistics in a foreach() loop.
Aheho
I am talking about pure SQL. In your case you would have to group by all the different criteria, and bring back how many rows?
SQLMenace
Currently the largest report would have to summarize 30K-40K rows.
Aheho
Are you sure that 5 SQL statements would not be faster (assuming all the WHERE clauses are properly covered by indexes) than pulling everything over the network (this might take a while) and then looping over all those rows?
SQLMenace
I'm not sure which is faster. I could do both and time it, but that would involve doing twice as much work. In addition, if my boss adds a new summarization category, the second method wouldn't need any data-access changes.
Aheho
+2  A: 

If your data is properly indexed, those reports may not require any table scans at all.
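
For example, a narrow index on a grouping column can let the optimizer answer a count from the index alone (a sketch; index and table names are hypothetical):

-- Hypothetical: the by-State count can be satisfied by scanning this
-- narrow index rather than the whole table
CREATE NONCLUSTERED INDEX IX_Employee_State ON Employee (State);

SELECT State, COUNT(*) AS cnt
FROM Employee
GROUP BY State;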

Really, for a problem like this you should code up the reports the simple way, and then see whether the performance meets the business requirements. If not, then look at optimisation strategies.
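
One cheap way to check is to let SQL Server report I/O and timing for each candidate query:

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- run the candidate report queries here and compare logical reads
-- and elapsed time in the Messages output
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;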

Greg Hewgill
I concede your second point. However, I challenge you to come up with an index scheme that allows me to get stats on "Employee Age" in 10-year bands while avoiding a table scan.
Aheho
@Aheho: in order to summarize you need to scan; if you do male/female, the index is also worthless because of very low selectivity.
SQLMenace
+1  A: 

You may see some performance gains by storing intermediate results in a table variable or temp table, then running further aggregation against it. An example with only two result sets:

-- One scan of the base table produces the intermediate counts
SELECT COUNT(*) AS cnt, State, AgeBracket
INTO #t
FROM YourTable
GROUP BY State, AgeBracket;

-- Each report then aggregates the much smaller intermediate table
SELECT SUM(cnt) AS cnt, State FROM #t GROUP BY State;
SELECT SUM(cnt) AS cnt, AgeBracket FROM #t GROUP BY AgeBracket;
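
You could fold the other dimensions into the same single scan as well; a sketch, with Gender and ServiceBracket as assumed column names:

-- Hypothetical extension: carry every grouping column in one pass
SELECT COUNT(*) AS cnt, State, AgeBracket, Gender, ServiceBracket
INTO #t2
FROM YourTable
GROUP BY State, AgeBracket, Gender, ServiceBracket;

SELECT SUM(cnt) AS cnt, Gender FROM #t2 GROUP BY Gender;
SELECT SUM(cnt) AS cnt, ServiceBracket FROM #t2 GROUP BY ServiceBracket;

The trade-off is that the intermediate row count grows with the product of distinct values per dimension, so with many dimensions it creeps back toward the size of the base table.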
AlexKuznetsov
Hey Alex, nice to see you here :-) Keep in mind (I know you already know this) that when doing a SELECT INTO against a large table you will get locking.
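
If that locking becomes a problem, one workaround is to create the temp table up front and populate it with a plain INSERT ... SELECT (column types here are guesses):

-- Hypothetical alternative that avoids SELECT INTO
CREATE TABLE #t (cnt int, State varchar(2), AgeBracket int);

INSERT INTO #t (cnt, State, AgeBracket)
SELECT COUNT(*), State, AgeBracket
FROM YourTable
GROUP BY State, AgeBracket;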
SQLMenace