Generally speaking, when combining a lot of data, is it better to use a temp table or table variable as a staging area, or should I just stick to UNION ALL?

Assumptions:

  • No further processing is needed, the results are sent directly to the client.
  • The client waits for the complete recordset, so streaming results isn't necessary.
+4  A: 

I would stick to UNION ALL. If there's no need for intermediate processing that would require a temp table, then I would not use one.

Inserting data into a temp table will involve work in tempdb, which can be a bottleneck. That holds even for a table variable which, despite the myths, is not a purely "in memory" structure. To then just SELECT * as-is and return it is unnecessary and, I think, bloats the code; when you only need to return data without any special processing, a temp table approach seems a bit "round the houses".

If I thought there was a reason to justify a temp table, I would run like-for-like performance tests with and without it, then compare the stats (duration, reads, writes, CPU). Actual performance tests are the best way to be confident that the approach you choose is the best one, especially as you don't have to be using temp tables for work to be pushed into tempdb: depending on your queries, they might involve work in tempdb anyway.

To clarify, I'm not saying one is better than the other, full stop. As with most things, it depends on the scenario. In the scenario described, it sounds like you'd be adding an extra step that has no functional value, and I can't see that you'd gain anything other than a slightly more complicated and lengthy query.
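
As a rough illustration of the like-for-like comparison described above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, so it says nothing about SQL Server or tempdb behaviour; the tables `orders_2009` and `orders_2010` and all function names are hypothetical. It only shows that both shapes return the same rows and how one might time them side by side.

```python
import sqlite3
import time


def build_demo_db():
    """Create an in-memory database with two hypothetical source tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders_2009 (id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE orders_2010 (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders_2009 VALUES (?, ?)",
                     [(i, i * 1.5) for i in range(1000)])
    conn.executemany("INSERT INTO orders_2010 VALUES (?, ?)",
                     [(i, i * 2.5) for i in range(1000, 2000)])
    return conn


def via_union_all(conn):
    """Return the combined rows in a single statement."""
    return conn.execute(
        "SELECT id, amount FROM orders_2009 "
        "UNION ALL "
        "SELECT id, amount FROM orders_2010"
    ).fetchall()


def via_temp_table(conn):
    """Stage the rows in a temp table, then select them back unchanged."""
    conn.execute("CREATE TEMP TABLE staging (id INTEGER, amount REAL)")
    conn.execute("INSERT INTO staging SELECT id, amount FROM orders_2009")
    conn.execute("INSERT INTO staging SELECT id, amount FROM orders_2010")
    rows = conn.execute("SELECT id, amount FROM staging").fetchall()
    conn.execute("DROP TABLE staging")  # clean up so the function can be re-run
    return rows


def timed(fn, conn):
    """Run fn(conn) and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = fn(conn)
    return rows, time.perf_counter() - start


if __name__ == "__main__":
    conn = build_demo_db()
    for fn in (via_union_all, via_temp_table):
        rows, elapsed = timed(fn, conn)
        print(fn.__name__, len(rows), "rows in", round(elapsed, 6), "s")
```

On a real SQL Server workload the numbers that matter are the ones mentioned above (duration, reads, writes, CPU), measured against your own data rather than a toy in-memory database.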

AdaTheDev
Gut feeling or can you back it up with theory?
Jonathan Allen
@Jonathan Allen - updated with a bit more on my thoughts
AdaTheDev
Just to be clear, I'm looking for a good default. When it really, really matters I still intend to try both.
Jonathan Allen
+2  A: 

One advantage of temp tables I can think of is that you can apply indexes to them. That should help when dealing with lots of data where you need to get results back as quickly as possible.
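
A minimal sketch of what staging plus an index might look like, again using Python's sqlite3 module as a stand-in rather than SQL Server. The tables `sales_a` and `sales_b`, the index name, and the function names are all hypothetical. Note that the index only pays off if the staged rows are then filtered or sorted; it does not change the result.

```python
import sqlite3


def make_sales_db():
    """Create an in-memory database with two hypothetical source tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_a (customer_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE sales_b (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales_a VALUES (?, ?)",
                     [(i % 10, float(i)) for i in range(100)])
    conn.executemany("INSERT INTO sales_b VALUES (?, ?)",
                     [(i % 10, float(i)) for i in range(100, 200)])
    return conn


def staged_lookup(conn, customer_id, with_index):
    """Stage rows in a temp table, optionally index it, then filter."""
    conn.execute("CREATE TEMP TABLE staging (customer_id INTEGER, amount REAL)")
    conn.execute("INSERT INTO staging SELECT customer_id, amount FROM sales_a")
    conn.execute("INSERT INTO staging SELECT customer_id, amount FROM sales_b")
    if with_index:
        # Building the index is extra work done before any row is returned.
        conn.execute(
            "CREATE INDEX idx_staging_customer ON staging (customer_id)")
    rows = conn.execute(
        "SELECT customer_id, amount FROM staging "
        "WHERE customer_id = ? ORDER BY amount",
        (customer_id,),
    ).fetchall()
    conn.execute("DROP TABLE staging")  # also drops the index
    return rows


if __name__ == "__main__":
    conn = make_sales_db()
    print(len(staged_lookup(conn, 3, with_index=True)), "rows for customer 3")
```

Whether the index build is worth its cost depends on how many times the staged data is read, which is something only a measurement on the real workload can settle.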

kevchadders
Agree that you can add indexes to temp tables, but if the underlying tables being queried to populate the temp table don't have appropriate indexes, you've still got the performance problem of the initial queries. You're just adding another step to the operation: populating the temp table, creating the index, then returning the data as-is from the temp table.
AdaTheDev
This would not necessarily be an advantage. If you are inserting into a table that will only ever be selected from once, the time saved selecting from the table would be spent building the indexes while inserting into it. Better to just apply the ORDER BY and the selects in the original query.
Mongus Pong
One major difference I've seen so far is that temp tables give you several small execution plans instead of one big one. That does make it easier to fix missing table indexes.
Jonathan Allen
+1  A: 

Not specific to UNION ALL:

Using a temp table might have an advantage from a concurrency point of view, depending on the query, the isolation level, and the performance of the clients and network link: in those cases a temp table could serve to minimize the time read locks are held. Just don't use SELECT ... INTO to create the table.

In the general case, UNION ALL avoids the overhead of an unnecessary work table.

Einstein