views:

64

answers:

3

I have two tables: A and P. I want to get information out of all rows in A whose id is in a temporary table I created, tmp_ids. However, there is additional information about A in the P table, foo, and I want to get this info as well. I have the following query:

SELECT A.H_id AS hid,
       A.id AS aid,
       P.foo, A.pos, A.size
FROM tmp_ids, P, A
WHERE tmp_ids.id = A.H_id
  AND P.id = A.P_id

I noticed it going slowly, and when I asked Postgres to explain, I noticed that it combines tmp_ids with an index on A I created for H_id with a nested loop. However, it hashes all of P before doing a Hash join with the result of the first merge. P is quite large and I think this is what's taking all the time. Why would it create a hash there? P.id is P's primary key, and A.P_id has an index of its own.

UPDATE: All the datatypes are INTEGER, except A.size which is a DOUBLE PRECISION and P.foo which is VARCHAR. I'm using PostgreSQL version 8.4.

Here is the explain: http://explain.depesz.com/s/WBo .

+1  A: 

Without seeing an explain analyze, these kind of problems usually result from statistics being off or an unusual setting required for random_page_cost or seq_page_cost.

It may run better with

set enable_hashjoin = false;
rfusca
+1  A: 

The query planner estimated it'd be faster to sequentially read all the data and hash it, than to perform an estimated 2100 index scans with their associated much more random disk access.

Stephen Denne
A: 

Your problem is that the optimizer doesn't have the right statistics to determine how many matches "A.H_id = tmp_ids.id" is going to create, which is a common problem with temporary tables--they don't have statistics the way a regular one does. It guesses that 21 rows are going to match coming out of the "Index Scan using idx_A_handid on A", but there are actually only 3. It's highlighted in the explain analysis where the lowest level up arrow has a 7 next to it, giving the multiplier for how wrong the estimate was.

That error carries forward to where it thinks it has 2100 rows to scan, at which point it might as well do a full sequential scan and hash the results given that's likely to touch most blocks in the table.

Had it known correctly there were only 300 to probe, it might have done something different involving only a subset of the data. You can't expect to get good plans from joins against temporary tables because of their lack of statistics. This may be a case where it's appropriate to nudge correct behavior by turning off enable_hashjoin before executing the query.

Greg Smith