views:

721

answers:

7

If I run the following SQL query

SELECT * FROM A
LEFT JOIN B
ON A.foo=B.foo
WHERE A.date="Yesterday"

Does the WHERE statement get evaluated before or after the join?

If after, what would be a better way to write this statement so that only rows in A from "Yesterday" are joined to B.

+5  A: 

depends on the database

on sql server, run: SET SHOWPLAN_ALL ON then run the query, you will get an idea of what happens when it runs.

KM
+1  A: 

Depends on indexes and statistics.

You should show the execution path of the query to determine where (if any) optimizations should be applied.

John Gietzen
+3  A: 

Semantically: After the JOIN. But in this case, there is no difference in timing, because it's on the LEFT side of the JOIN.

As you already have it, "only rows in A from "Yesterday" are joined to B".

The optimizer is free to reorganize its order of operations depending on the equivalences in the relational algebra.

This returns only A.date="Yesterday" and joins B where it can find a match on foo:

SELECT * FROM A
LEFT JOIN B
    ON A.foo=B.foo
WHERE A.date="Yesterday"

This returns all A regardless of any criteria and joins B where A.date="Yesterday" AND it finds a match on foo:

SELECT * FROM A
LEFT JOIN B
    ON A.foo=B.foo
    AND A.date="Yesterday"
Cade Roux
Just to understand, are you saying that putting the criteria in the WHERE clause is better from a performance perspective than putting it inside the JOIN clause?
Chris Lively
It makes no difference to performance on an INNER JOIN and that is why the optimizer has more options to move things around on INNER JOINs. On an OUTER JOIN (LEFT or RIGHT), it affects the semantics of the declaration, so performance is not usually the issue.
Cade Roux
+6  A: 

Your idea of "evaluation" is not correct as SQL is a declarative language.

BTW you can see the query execution plan. In MySQL prefix your query with keyword describe to see the execution plan.

Tahir Akhtar
+1  A: 

The order of operations to satisfy a query is determined why the whim of the particular database's query optimizer. A query optimizer tries to product a good "query plan" (set of operations) based on what it can glean from the query and whatever statistics it has on hand about the database (which could include the cardinality of tables and certain distributions of data).

In your case, the answer may depend on whether you have a secondary index on A.date

Query optimization a fairly rich topic. The documentation for whatever database you're using will have a lot more to say about it.

Dave W. Smith
A: 

in SQL Server:

As a general rule of thumb, JOIN clauses are evaluated before WHERE clauses.

In case of complex joins that need filters in the join part, I write them along with my join

SELECT *
FROM A
LEFT JOIN B
    ON A.Foo1 = B.Foo1
    And A.Date = 'Yesterday'
OUTER JOIN C
    ON B.Foo2 = C.Foo2
JOIN D
    ON B.Foo3 = D.Foo3
Raj More
This does not have the same meaning when you use it inside the LEFT JOIN. In the LEFT JOIN, when the criteria aren't satisfied, the LEFT rows are still returned, and any columns from the RIGHT are NULL.
Cade Roux
I stand corrected on the example provided. I was trying to make the point of putting the where part of a clause along with the appropriate join. I have corrected the statement.
Raj More
:)i stand recorrected.. i can NOT correct the statement.
Raj More
A: 
SELECT * 
FROM (SELECT * FROM A WHERE Date = 'Yesterday') A
LEFT JOIN B 
    ON A.Foo1 = B.Foo1 
OUTER JOIN C 
    ON B.Foo2 = C.Foo2 
JOIN D 
    ON B.Foo3 = D.Foo3 
Roger Willcocks