views:

1100

answers:

4

Why does the use of temp tables with a SELECT statement improve the logical I/O count? Wouldn't it increase the amount of hits to a database instead of decreasing it. Is this because the 'problem' is broken down into sections? I'd like to know what's going on behind the scenes.

A: 

AFAIK, at least with mysql, tmp tables are kept in RAM, making SELECTs much faster than anything that hits the HD

MattW.
A: 

There are a class of problems where building the result in a collection structure on the database side is much preferable to returning the result's parts to the client, roundtripping for each part.

For example: arbitrary depth recursive relationships (boss of)

There's another class of query problems where the data is not and will not be indexed in a manner that makes the query run efficiently. Pulling results into a collection structure, which can be indexed in a custom way, will reduce the logical IO for these queries.

David B
+1  A: 

I'm going to assume by temp tables you mean a sub-select in a WHERE clause. (This is referred to as a semijoin operation and you can usually see that in the text execution plan for your query.)

When the query optimizer encounter a sub-select/temp table, it makes some assumptions about what to do with that data. Essentially, the optimizer will create an execution plan that performs a join on the sub-select's result set, reducing the number of rows that need to be read from the other tables. Since there are less rows, the query engine is able to read less pages from disk/memory and reduce the amount of I/O required.

Jeremiah Peschka
A: 

There's no general answer. It depends on how the temp table is being used.

The temp table may reduce IO by caching rows created after a complex filter/join that are used multiple times later in the batch. This way, the DB can avoid hitting the base tables multiple times when only a subset of the records are needed.

The temp table may increase IO by storing records that are never used later in the query, or by taking up a lot of space in the engine's cache that could have been better used by other data.

Creating a temp table to use all of its contents once is slower than including the temp's query in the main query because the query optimizer can't see past the temp table and it forces a (probably) unnecessary spool of the data instead of allowing it to stream from the source tables.

Chris Smith
its worth mentioning that normaly a temp table is builed as type MEMORY only if your RAM or your config dont allow such a gib memeory table its using the disk, which slows down temp tables significant.
Rufinus