views:

89

answers:

2

Why would the following queries return different result sets?

select count(ml.link_type),mc.conv_string
from MSP_CONVERSIONS mc,MSP_LINKS ml
where ml.PROJ_ID = 4
and mc.STRING_TYPE_ID = 3
and mc.CONV_VALUE *= ml.link_type
group by mc.conv_string

select count(ml.link_type),mc.conv_string
from MSP_CONVERSIONS mc left outer join MSP_LINKS ml on mc.CONV_VALUE = ml.LINK_TYPE
where ml.PROJ_ID = 4
and mc.STRING_TYPE_ID = 3
group by mc.conv_string

The first query returns:

3 FF

10790 FS

0 SF

117 SS

The second query returns:

3 FF

10790 FS

117 SS

Both queries are run against a SQL Server 2008 Standard database. I cannot understand why two different result sets get returned? I thought that *= was shorthand syntax for LEFT OUTER JOIN. I have been looking at this for so long, maybe I missed something small?

Thanks...

+4  A: 

Because your first query is really equivalent to this:

select count(ml.link_type),mc.conv_string
from MSP_CONVERSIONS mc
LEFT JOIN MSP_LINKS ml
    ON ml.PROJ_ID = 4
    and mc.STRING_TYPE_ID = 3
    and mc.CONV_VALUE = ml.link_type
group by mc.conv_string

You've pulled all the conditions up into the join, making it impossible to ever completely filter out any rows from the MSP_CONVERSIONS table. Best to always stick with the full 'LEFT/INNER JOIN' syntax and avoid the confusion.

Joel Coehoorn
Thanks for the answer however the query example you provided returns EVERY mc.conv_string.This query returns things as I needed:select count(ml.link_type),mc.conv_stringfrom MSP_CONVERSIONS mcLEFT OUTER JOIN MSP_LINKS ml ON mc.CONV_VALUE = ml.link_type and ml.PROJ_ID = 4where mc.STRING_TYPE_ID = 3group by mc.conv_stringThanks for your help.
+2  A: 

"*=" isn't so much "shorthand" syntax as old-fashion pre-ANSI OUTER JOIN syntax. Don't use it. Also, in general, if you have "a LEFT OUTER JOIN b ..." in your selection, then adding additional criteria on "b" in the WHERE clause is a bad idea- if it's read as filtering to be applied to the result of the join, then it will discard all rows where there was no match in b--- effectively converting your outer join to an inner join.

This ties into what Joel wrote--- having all the conditions in the "ON" clause means that the filtering is applied at the time of joining, and that's a different result. The ANSI syntax is more explicit.

araqnid
+1 - ANSI-89 syntax is evil. SQL Server actually handles ANSI-92 (explicit join) syntax more performantly in a number of situations, so you're best to stick with explicit joins.
Aaron Alton