ansaurus

Question

Answer 1

+6 A:

This is how I would do it

SELECT <<list only columns you need>> 
FROM large_historical_transaction_table t
WHERE EXISTS (SELECT 1 FROM OneRowTable o 
              WHERE o.last_processing_date = t.process_date)

SQLMenace 2010-05-21 14:24:11

I'm sure you know what you're talking about, with a name like SQLMenace and a rep of 17k, but on first glance, it looks like correlated subquery will be executed for each row in the large table?

mdma 2010-05-21 14:27:37

+1, @mdma, it looks that way, but this is faster than the regular join

KM 2010-05-21 14:33:01

Thanks for clarification. I'm interested to get into the details of how it's faster - do you know of links describing this in more detail?

mdma 2010-05-21 15:02:12

@mdma: Generally speaking, a `JOIN` implies returning **all** matching rows whereas `EXISTS` only needs to find the first one. Although in this case it doesn't matter because there's only one row in the lookup table to begin with.

Aaronaught 2010-05-21 15:15:08

+1, even though I don't get it! @Aaronaught - I kind of see where you are coming from, but here there is only one row, so what's the actual difference in practice?

mdma 2010-05-21 15:17:23

I just tried this in SQL server. Using an INNER JOIN and the subselect - the actual query plans are identical, so the subquery wasn't faster.

mdma 2010-05-21 15:22:49

@mdma: With only one row, there probably is no difference in practice. Still, it's good to get into the habit of using `IN` or `EXISTS` when you don't actually need a true `JOIN` - oftentimes (but not always) it will lead to a more efficient plan. Especially on, shall we say, *differently-abled* DBMSes like mysql.

Aaronaught 2010-05-21 15:24:23

Differently-abled! hehe! :-) Thanks for the advice, I'll keep that in mind. I kind of have a wired in knee jerk reaction when relating two tables to use a join, I'll have to unlearn that habit.

mdma 2010-05-21 15:48:01

Answer 2

+2 A:

An Inner join is a symmetrical, bi-directional relationship, in general it doesn't matter, but in this case I would suggest not joining at all, read the threshold date into a variable and pass it to the other select query as a parameter...

Charles Bretana 2010-05-21 14:24:23

Answer 3

+2 A:

For readability I would inner join from the transaction table to explicitly indicate that the second table with the date just acts like a filter.

Raj 2010-05-21 14:25:49

Answer 4

+2 A:

When joining tables, the query optimizer takes a quick sniff of both to determine the most appropriate join implementation. Logically the inner join is symmetric, but the implementation may favor one side over the other for improved performance.

mdma 2010-05-21 14:29:34

Answer 5

A:

A cross join will do it:

SELECT t.col1, t.col2, p.process_date
FROM Transactions t, Process p;

dportas 2010-05-21 15:00:13

@David. No it wouldn't. It would return **all** records from Transactions. That is not what OP wants.

Lieven 2010-05-21 15:06:32

Ah, missed that. In that case it is just an inner join. Thanks for the correction.

dportas 2010-05-21 17:59:15

If you change your answer I can do something about that -1.

Keng 2010-05-23 13:28:53

ansaurus

tags:

views:

answers:

SQL Query Theory Question...

related questions