I have a table T1 with 60 rows and 5 columns: ID1, ID2, info1, info2, info3.

I have a table T2 with 1.2 million rows and another 5 columns: ID3, ID2, info4, info5, info6.

I want to get (ID1, ID2, info4, info5, info6) from all the rows where the ID2s match up. Currently my query looks like this:

SELECT T1.ID1, T2.ID2,
       T2.info4, T2.info5, T2.info6
  FROM T1, T2
 WHERE T1.ID2 = T2.ID2;

This takes about 15 seconds to run. My question is - should it take that long, and if not, how can I speed it up? I figure it shouldn't since T1 is so small.

I asked PostgreSQL to EXPLAIN the query, and it says that it hashes T2, then hash joins that hash with T1. It seems hashing T2 is what takes so long. Is there any way to write the query so it doesn't have to hash T2? Or, is there a way to have it cache the hash of T2 so it doesn't re-do it? The tables will only be updated every few days.

If it makes a difference, T1 is a temporary table created earlier in the session.
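For anyone reproducing this, the plan can be inspected roughly as follows. This is only a sketch using the table and column names above. The ANALYZE step is an added suggestion rather than something from the original session: autovacuum does not process temporary tables, so T1 has no planner statistics unless it is analyzed by hand.

```sql
-- Collect planner statistics for the temporary table;
-- autovacuum skips temporary tables.
ANALYZE T1;

-- EXPLAIN ANALYZE runs the query and reports the chosen plan
-- together with the actual time spent in each node.
EXPLAIN ANALYZE
SELECT T1.ID1, T2.ID2,
       T2.info4, T2.info5, T2.info6
  FROM T1, T2
 WHERE T1.ID2 = T2.ID2;
```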

+7  A: 

It should not take that long :)

Creating an index on T2( ID2 ) should improve the performance of your query.
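A minimal sketch of that index, assuming the table and column names from the question. The index name t2_id2_idx is made up for illustration.

```sql
-- Hypothetical index name; any unused name works.
CREATE INDEX t2_id2_idx ON T2 (ID2);

-- Refresh planner statistics for T2.
ANALYZE T2;
```

With only 60 rows in T1, the planner can then choose to look up each T1.ID2 in the index instead of hashing all 1.2 million rows of T2. Whether it does so is visible in the EXPLAIN output.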

Peter Lang
nice now it's blazing fast =).
Claudiu
i added indices to the rest of the database too, and now i can add new items to it 50 times/second instead of.. once every 2 seconds. sweeet!
Claudiu
A: 

Maybe using an explicit JOIN will speed up the query:

SELECT T1.ID1, T2.ID2,
    T2.info4, T2.info5, T2.info6
FROM T1
JOIN T2 ON T2.ID2 = T1.ID2;

I am not sure, but your query may first join every row of both tables and only then apply the WHERE condition, which would be the problem.

And of course, as Peter Lang said, you should create an index.

Pavel Belousov
i thought of this already, but it didnt make a difference. i think my way is syntax sugar for this way.
Claudiu
So I was wrong :) But whenever I want to join tables I use JOIN instead of your variant ;)
Pavel Belousov
The JOIN syntax was standardised much later than the traditional join-conditions-in-the-where-clause usage. I can remember them being newly-supported in Oracle 8 or so. IMHO a case of a new standard actually making things better (esp. outer joins) for once.
araqnid
Explicit joins are hardly a new thing; implicit joins have been outdated on most systems for 18 years. Implicit joins are a bad thing: they are harder to maintain, they make it easier to commit mistakes that the syntax won't catch (accidental cross joins), and they are horrible when you need left or right joins (some databases don't support the implicit left join syntax properly at all). It is poor programming practice to ever use an implicit join. There is no excuse to use them now, as all databases support explicit joins.
HLGEM
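To illustrate the accidental cross join mentioned in the comment above, here is a sketch using the tables from the question. The row count is only the product of the sizes given there.

```sql
-- Implicit join with the condition forgotten: this is valid SQL and
-- returns every pairing of rows (60 x 1.2 million, roughly 72 million rows).
SELECT T1.ID1, T2.ID2
  FROM T1, T2;

-- Explicit join: in PostgreSQL, leaving out the ON clause here is a
-- syntax error, so the mistake is caught before the query runs.
SELECT T1.ID1, T2.ID2
  FROM T1
  JOIN T2 ON T1.ID2 = T2.ID2;
```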
A: 

First, make it an explicit join.

SELECT T1.ID1, T2.ID2,
       T2.info4, T2.info5, T2.info6
  FROM T1
  JOIN T2 ON T1.ID2 = T2.ID2;

Then try creating an index on T2.ID2.

If that is not enough and your schema allows it, you can add an ID1 column to T2 and update it every few days when the tables change, as you describe. Then it is just a simple query on T2 with no joins.

SELECT T2.ID1, T2.ID2,
       T2.info4, T2.info5, T2.info6
  FROM T2 
  WHERE T2.ID2 = A_VALUE;

Again, an index on T2.ID2 is recommended.
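The step of adding ID1 to T2 could look something like the sketch below. It rests on assumptions not stated in the question: that ID1 is an integer, and that each ID2 value appears at most once in T1 (otherwise the UPDATE picks one of the matching ID1 values arbitrarily). Rows that stop matching keep their old ID1 unless it is reset before the update is rerun.

```sql
-- Assumption: ID1 is an integer; use the actual type of T1.ID1.
ALTER TABLE T2 ADD COLUMN ID1 integer;

-- Copy ID1 across for every T2 row whose ID2 appears in T1.
-- Rerun this after each refresh of the tables.
UPDATE T2
   SET ID1 = T1.ID1
  FROM T1
 WHERE T1.ID2 = T2.ID2;

-- Rows that matched T1 now carry a non-NULL ID1, so no join is needed.
SELECT T2.ID1, T2.ID2,
       T2.info4, T2.info5, T2.info6
  FROM T2
 WHERE T2.ID1 IS NOT NULL;
```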