ansaurus

Question

What is the difference in these two queries as getting two different result set?

Answer 1

+5 A:

These queries should be returning identical results. Please post the complete queries.

Below is a sample demonstrating the same output from the two variations:

Sample:

CREATE TABLE [dbo].[t1](
    [ID] [int] NULL,
    [StatusID] [int] NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[t2](
    [ID] [int] NULL
) ON [PRIMARY]
INSERT INTO t1 (ID, StatusID) VALUES (1, 10)
INSERT INTO t1 (ID, StatusID) VALUES (2, 11)
INSERT INTO t1 (ID, StatusID) VALUES (3, 12)
INSERT INTO t1 (ID, StatusID) VALUES (4, 12)
INSERT INTO t2 (ID) VALUES (1)
INSERT INTO t2 (ID) VALUES (3)
INSERT INTO t2 (ID) VALUES (5)

SELECT t1.ID, t1.StatusID
FROM t1 
INNER JOIN t2 ON t1.Id = t2.Id 
WHERE t1.StatusId = 12 

SELECT t1.ID, t1.StatusID
FROM t1 
INNER JOIN t2 ON t1.Id = t2.Id 
    AND t1.StatusId = 12

Output:

ID          StatusID
----------- -----------
3           12

(1 row(s) affected)

ID          StatusID
----------- -----------
3           12

(1 row(s) affected)

RedFilter 2010-10-13 18:51:21

they aren't identical...there's a subtle difference in the 2

Leslie 2010-10-13 19:13:49

@Leslie: That is incorrect. For an `INNER JOIN`, the results are identical whether you filter in the `WHERE` clause or `ON` clause.

RedFilter 2010-10-13 19:19:22

Agreed. The difference matters for outer joins, but should not matter for inner.

Nikki9696 2010-10-13 19:28:38

Why the downvote? Vote with facts, not assumptions :)

RedFilter 2010-10-13 19:32:33

I updated my response with a sample case.

RedFilter 2010-10-13 19:36:41

Answer 2

A:

I believe that it is like this:

The first query gets all the results from the joined tables and then filters on the WHERE clause.

The second query gets the results from table 1 where the StatusID = 12 and then performs the join on the result set.

It's a subtle difference.

Edit: Here's a link to an article verifying the difference: http://www.sqlteam.com/article/additional-criteria-in-the-join-clause

Leslie 2010-10-13 19:13:22

how does this subtle difference cause a difference in output?

Jimmy 2010-10-13 19:15:50

Yes, but the question is why don't they return the same results. They appear to the naked eye to request the same information and I would guess from looking at them that they would generate the same execution plan.

Larry Lustig 2010-10-13 19:18:09

-1: Believing is not enough in this case - please provide a sample use case demonstrating the difference in output.

RedFilter 2010-10-13 19:36:23

-1 as well. I've tested multiple tables, and the execution plans are identical regardless of using `WHERE t1.StatusId = 12 ` or adding `AND t1.StatusId = 12` as part of the join. SQL should be smart enough to figure out the execution plans based on the query.

LittleBobbyTables 2010-10-13 19:48:39

From the article you linked to: "Note that there is no difference in the result sets if we use an inner join instead of an outer join, because the inner join removes the non-matching records regardless."

Jon Schoning 2010-10-15 15:57:29

Answer 3

+1 A:

SQL is a declarative language, and it's up to the DBMS to implement your query. The SQL Server query optimiser does not care whether you inner-join two tables using a JOIN expression, a WHERE expression, an IN expression or an EXISTS expression, internally they are all treated the same. SQL Server has 3 different internal join algorithms, which are selected only on the basis of row count and data distribution, not the exact syntax used to declare them. If you compare the execution plans of the two queries in your example you should find that they are identical, and will therefore produce the same resultset.

Christian Hayter 2010-10-13 19:22:17

Answer 4

A:

The difference under the hood is that the WHERE clause is accessed after all the joins.

This means that maybe in this case there is no difference but that would not always be true when there are multiple joins. They could be factored out of the virtual resultsets (depending on the circumstances) before the WHERE is processed.

Kindly see this link: http://blog.sqlauthority.com/2009/04/06/sql-server-logical-query-processing-phases-order-of-statement-execution/

Mike Cheel 2010-10-13 20:09:35

Answer 5

+2 A:

Looking at the edit history of the question it appears that your queries are along the following lines.

Query One

  INNER JOIN dbo.T2
        ON ...
    LEFT OUTER  JOIN dbo.T3
        ON  ...
    WHERE 
     T3.col = somevalue AND ...

Query Two

  INNER JOIN dbo.T2
        ON ...
    LEFT OUTER  JOIN dbo.T3
        ON  ... AND T3.col = somevalue
   WHERE 
        ...

The difference between them is that Query One effectively converts the LEFT Join to an INNER Join.

For a left outer join conceptually the following happens.

T2 is joined onto T3 and the predicate in the join clause is evaluated.
Any non matching rows from T2 are added back in with NULL values for the T3 column values.
The WHERE clause is applied.

None of these rows added back in in step 2 will meet the T3.col = somevalue predicate in step 3 as we know that the value of this column for all these rows is NULL.

Martin Smith 2010-10-13 20:52:52

Answer 6

+1 A:

It really makes no difference when you do this in the INNER JOIN.

However, when you use LEFT or RIGHT JOIN, it does make a difference whether you put the additional filter into the JOIN or into the WHERE clause.

When you put the filter into the WHERE clause, SQL Server does the join first, and then completely filters out the rows where the filter does not fit.
--> this will reduce the number of rows which are returned

When you put the filter into the JOIN, SQL Server does the filtering during the join, but only on the table where you put the filter.
You still get all the rows from the other tables, but only those have the data from the filtered table where the filter fits.
--> this will not reduce the number of rows, but the columns with data from the filteres table will be empty in more rows

It's difficult to explain...to make it more clear, here's an example:

Take the sample data from RedFilter's answer:

CREATE TABLE [dbo].[t1](
    [ID] [int] NULL,
    [StatusID] [int] NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[t2](
    [ID] [int] NULL
) ON [PRIMARY]
INSERT INTO t1 (ID, StatusID) VALUES (1, 10)
INSERT INTO t1 (ID, StatusID) VALUES (2, 11)
INSERT INTO t1 (ID, StatusID) VALUES (3, 12)
INSERT INTO t1 (ID, StatusID) VALUES (4, 12)
INSERT INTO t2 (ID) VALUES (1)
INSERT INTO t2 (ID) VALUES (3)
INSERT INTO t2 (ID) VALUES (5)

...and run the following queries on it:

/* this returns four rows, but only two will have data 
from the second table in the second column */
SELECT t1.ID, t2.ID
FROM t1 
LEFT JOIN t2 ON t1.Id = t2.Id 

/* this returns only one row: the one where t2.ID = 1 */
SELECT t1.ID, t2.ID
FROM t1 
LEFT JOIN t2 ON t1.Id = t2.Id 
WHERE t2.ID = 1 

/* this returns four rows as in the first query, but only one 
row will have data in the second column: the one where t2.ID = 1 */
SELECT t1.ID, t2.ID
FROM t1 
LEFT JOIN t2 ON t1.Id = t2.Id 
AND t2.ID = 1

Note the different results as indicated in the comments.

haarrrgh 2010-10-13 20:54:49

ansaurus

tags:

views:

answers:

What is the difference in these two queries as getting two different result set?

related questions