views:

57

answers:

2

Imagine that we have two tables as follows:

Trades
(
  TradeRef INT NOT NULL, 
  TradeStatus INT NOT NULL,
  Broker INT NOT NULL,
  Country VARCHAR(3) NOT NULL
)

CTMBroker
(
  Broker INT NOT NULL,
  Country VARCHAR(3) NULL
)

(These have been simplified for the purpose of this example). Now, if we wish to join these two tables on the Broker column, and if a country exists in the CTMBroker table on the Country, we have the following two choices:

SELECT T.TradeRef,T.TradeStatus
FROM Trades AS T
JOIN CTMBroker AS B ON B.Broker=T.Broker AND ISNULL(B.Country, T.Country) = T.Country

or

SELECT T.TradeRef,T.TradeStatus
FROM Trades AS T
JOIN CTMBroker AS B ON B.Broker=T.Broker AND (B.COUNTRY=T.Country OR B.Country IS NULL)

These are both logically equivalent, however in this specific circumstance for our database (SQL Server 2008, SP1) two different execution plans are produced for these two queries with the second version significantly outperforming the first version in terms of both time and logical reads.

My question really is as follows: as a general rule would (2) be preferred to (1), or does this just happen to be exploiting some particular idiosyncracy of the optimiser in 2008 SP1 (that could therefore change with future versions of SQL Server).

+1  A: 

Could you try this one too?

SELECT T.TradeRef,T.TradeStatus
FROM Trades AS T
JOIN CTMBroker AS B ON B.Broker=T.Broker AND B.COUNTRY=T.Country
UNION ALL
SELECT T.TradeRef,T.TradeStatus
FROM Trades AS T
JOIN CTMBroker AS B ON B.Broker=T.Broker AND B.Country IS NULL

Often UNION ALL outperforms OR, depending on the ability of the optimizer to optimize the OR. If this is the same speed as your second query it explains why it is so fast: the optimizer is able to see that there are two separate conditions and use the index appropriately.

In the first case because you are using a non-sargable function (ISNULL), the index cannot be used.

Mark Byers
The UNION ALL version performs on a par with the second of the two versions that I'd posted (slighly worse, but only because of a couple of extra logical reads against the Trades table - in terms of execution time there are only a couple of ms difference between the two).As a general solution I'm not sure that UNION ALL is always the answer. As the number of joins increases we would, in effect, have to duplicate certain code decreasing maintainability. Of course, as you point out, UNION ALL certainly has its place in some situations.
Paul McLoughlin
+2  A: 

I think part of the reason is that ISNULL is a function, therefore SQL can't use any indexes on the table, but has to scan through the table evaluating every row. I try to avoid functions in join conditions for this reason.

Jeremy
+1 You never want to use any function or expression on the "left" side of WHERE condition. This does not allow to use indexes and it causes full table scan - in QA, turn on show plan and you will easily see it; if you see table scan anywhere - no good unless it's very small table
IMHO