views:

243

answers:

2

I am working on optimizing one of our SQL jobs.

In a few places we have used the <> operator. The same check can be written using NOT EXISTS. I am wondering which is the better way of doing it.

Sample Query

IF (@Email <> (SELECT Email FROM Members WHERE MemberId = @MemberId))
-- Do something.

-- The same thing can be written as
IF (NOT EXISTS (SELECT Email FROM Members WHERE MemberId = @MemberId AND Email = @Email))

Which is better?

I went through the execution plans for both (I couldn't attach them, as all image hosting is blocked in the office). The <> version has extra Assert and Stream Aggregate operations compared with the NOT EXISTS version. I'm not sure whether they are good, bad, or have no impact.

+3  A: 

NOT EXISTS is generally better (although in your case, if the table is small and/or properly indexed, that may not hold).

Almost always you should use EXISTS/NOT EXISTS for queries in which you're trying to find out whether a certain record exists (or does not exist)!

The reasoning is that EXISTS (and NOT EXISTS) queries stop as soon as the condition is fulfilled (or, in the case of NOT EXISTS, proven false), as opposed to subqueries, which continue to scan records through the whole table.
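
To make the contrast concrete, here is a minimal sketch of an existence check written both ways. The table variable @Members and its sample rows are made up for illustration, and the syntax assumes SQL Server 2008 or later. Whether the counting form really does more work depends on the optimizer, so treat this as an illustration of the two shapes rather than a benchmark.

```sql
-- Hypothetical stand-in for the Members table.
DECLARE @Members TABLE (MemberId INT PRIMARY KEY, Email VARCHAR(100));
INSERT INTO @Members (MemberId, Email)
VALUES (1, 'a@example.com'), (2, 'a@example.com'), (3, 'b@example.com');

DECLARE @Email VARCHAR(100) = 'a@example.com';

-- Existence form: only asks whether at least one matching row exists,
-- so the engine is free to stop at the first match.
IF EXISTS (SELECT 1 FROM @Members WHERE Email = @Email)
    PRINT 'EXISTS: at least one member has this email';

-- Counting form: asks for the number of matching rows and then compares it,
-- which may require visiting every matching row.
IF (SELECT COUNT(*) FROM @Members WHERE Email = @Email) > 0
    PRINT 'COUNT: at least one member has this email';
```

Comparing the execution plans of the two IF statements on a realistically sized table is the most reliable way to see whether the difference matters in a given case.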

Miky Dinescu
The subquery above gives just one result. So should there be any difference?
noob.spt
I guess it should be the same here, though I am still a little skeptical about the two extra operators (Assert and Stream Aggregate) that show up in the execution plan. I have to do more research on this. Thanks.
noob.spt
+1  A: 

The difference between your two statements lies in the question of how much is done in pure SQL and how much is done by the engine running the procedures, scripts, etc. (I'd like to say what is done by the database and what is done outside of the database, but in a stored proc both parts are handled by the database.)

In your example, the first statement uses SQL to fetch one member's Email. The table access uses what I assume is a primary key and its associated unique index, so it should be really fast even for a large table. The Email is passed outside of SQL, and the comparison is then done in the script.

In the second statement, pretty much the same happens. The MemberId is again used to access the unique record, then the email is compared and a boolean result is passed back outside of SQL.

Therefore, the performance for your example should be pretty similar.
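
Performance aside, the two forms are not strictly interchangeable. When no row matches @MemberId (or the stored Email is NULL), the scalar subquery yields NULL, the <> comparison evaluates to UNKNOWN, and the IF body is skipped, whereas NOT EXISTS is true. The sketch below uses a made-up table variable and sample values, and assumes SQL Server 2008 or later with the default ANSI_NULLS ON setting.

```sql
-- Hypothetical stand-in for the Members table.
DECLARE @Members TABLE (MemberId INT PRIMARY KEY, Email VARCHAR(100) NULL);
INSERT INTO @Members (MemberId, Email) VALUES (1, 'a@example.com');

DECLARE @MemberId INT = 2;                       -- no member with this id
DECLARE @Email VARCHAR(100) = 'a@example.com';

-- Scalar subquery returns NULL here, so the comparison is UNKNOWN
-- and the ELSE branch is expected to run.
IF (@Email <> (SELECT Email FROM @Members WHERE MemberId = @MemberId))
    PRINT '<>: condition is true';
ELSE
    PRINT '<>: condition is not true';

-- No row satisfies both predicates, so NOT EXISTS is true.
IF NOT EXISTS (SELECT 1 FROM @Members
               WHERE MemberId = @MemberId AND Email = @Email)
    PRINT 'NOT EXISTS: condition is true';
ELSE
    PRINT 'NOT EXISTS: condition is not true';
```

If the member is guaranteed to exist and Email is never NULL, both forms should take the same branch; otherwise the choice between them is a question of intended behaviour as well as speed.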

There will be different considerations (as MikyD has noted) when more than one value has to be transferred outside of SQL and a more complicated comparison has to be done (e.g. selecting a large number of emails using SQL and then doing the comparison in the script with something like Email IN (SELECT ..)). In that case it is usually preferable to do as much work as possible in SQL, transfer the least amount of data between SQL and non-SQL code, and let the database figure out the most effective way to get at the data.
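
As a rough sketch of "doing the work in SQL", the many-values case can be expressed as one set-based statement instead of fetching emails and comparing them one by one in procedural code. Both table variables (@Members and @Incoming) and their contents are hypothetical, and the syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical current data and a hypothetical batch of incoming values.
DECLARE @Members  TABLE (MemberId INT PRIMARY KEY, Email VARCHAR(100));
DECLARE @Incoming TABLE (MemberId INT PRIMARY KEY, Email VARCHAR(100));

INSERT INTO @Members  (MemberId, Email)
VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO @Incoming (MemberId, Email)
VALUES (1, 'a@example.com'), (2, 'new-b@example.com');

-- One statement returns only the incoming rows with no identical
-- (MemberId, Email) pair in @Members; the database chooses the access path.
SELECT i.MemberId, i.Email
FROM @Incoming AS i
WHERE NOT EXISTS (SELECT 1
                  FROM @Members AS m
                  WHERE m.MemberId = i.MemberId
                    AND m.Email = i.Email);
```

Only the rows that actually differ leave the query, which keeps the amount of data handed to the surrounding script small.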

IronGoofy
Thanks IronGoofy!
noob.spt