What are examples of SQL Server Query features/clauses that should be avoided?

I recently learned that the NOT IN clause degrades performance badly.

Do you have more examples?

+5  A: 

The reason to avoid NOT IN isn't really performance, it's that it has really surprising behaviour when the set contains a null. For example:

select 1
where 1 not in (2,null)

This won't return any rows, because the where is interpreted like:

where 1 <> 2 and 1 <> null

First, 1 <> null evaluates to unknown. Then 1 <> 2 and unknown evaluates to unknown. Because a WHERE clause keeps only rows for which the condition is true, you won't receive any rows.
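
The behaviour described above can be reproduced with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in, on the assumption that SQL Server applies the same three-valued logic here; the `excluded` table is hypothetical, and `NOT EXISTS` is shown only as one commonly used NULL-safe rewrite, not as something the answer itself proposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NOT IN against a list containing NULL: the predicate is UNKNOWN, so no row comes back.
with_null = conn.execute("SELECT 1 WHERE 1 NOT IN (2, NULL)").fetchall()

# Same list without the NULL: the predicate is TRUE and the row is returned.
without_null = conn.execute("SELECT 1 WHERE 1 NOT IN (2)").fetchall()

# Hypothetical table holding the same values (2 and NULL) for the NOT EXISTS variant.
conn.execute("CREATE TABLE excluded (val INTEGER)")
conn.executemany("INSERT INTO excluded VALUES (?)", [(2,), (None,)])

# NOT EXISTS only asks whether a matching row exists, so the NULL row is simply not a match.
not_exists = conn.execute(
    "SELECT 1 WHERE NOT EXISTS (SELECT 1 FROM excluded e WHERE e.val = 1)"
).fetchall()

print(with_null)     # rows from NOT IN with a NULL in the list
print(without_null)  # rows from NOT IN without the NULL
print(not_exists)    # rows from the NOT EXISTS rewrite
```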

Andomar
Solution: Ban `NULL`. At least that's what some SOians are preaching! +1.
Aaronaught
A: 

I recently changed a view from

Select Row1, Row2 FROM table Where blahID = FKblahID
UNION
Select Row1, Row2 FROM table2 Where blah2ID = FKblahID

to just

Select Row1, Row2 FROM table Where blahID = FKblahID

and saw a query that had been taking ~8 minutes now take only ~20 seconds. I'm not 100% sure why the change was so big.

Also, the second half of the union was only returning about 200 records, while the first was returning a couple thousand.

corymathews
How is that a feature/clause to be avoided?
OMG Ponies
@OMG Ponies: Obviously you haven't heard about the cool new feature that lets you slam a bunch of `UNION` statements together instead of changing your `WHERE` condition, at the mere cost of performance and readability. You should try it. Or maybe he didn't realize that `UNION` implies `DISTINCT`...
Aaronaught
@Aaronaught: It's a poorly written query, and the update risks omitting data that was previously returned.
OMG Ponies
@OMG Ponies: Sorry, Markdown doesn't have a mapping to `<sarcasm>` yet. I tried. :P
Aaronaught
@Aaronaught: My sarcasm meter is busted today :/
OMG Ponies
@OMG Ponies: Look at the difference in query times: from over 8 minutes to around 20 seconds. That's why I would avoid a UNION. I was able to avoid it by changing the requirements: I changed what was needed in the report and so no longer had to query the second table. That's not always an option, though.
corymathews
@Aaronaught: How would you write such a query better when you must query 2 tables and return data for each? I see it as a necessity if the requirements need that data. I am interested in how this could be written so that it is no longer poorly written.
corymathews
@corymathews: It's a PEBKAC error, not an inherent SQL Server functionality issue.
OMG Ponies
@corymathews: You probably wanted a `UNION ALL`. I don't see how the result sets of two separate tables could combine to create duplicates (unless there's something wrong with the design).
Aaronaught
+2  A: 

I avoid correlated subqueries (noncorrelated subqueries and derived tables are OK) and any cursors that I can avoid. Also avoid while loops if you can. Think in terms of sets of data, not row-by-row processing.

If you are using a UNION, check to see if UNION ALL will work instead. There is a potential results difference, so make sure before you make the change.
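
To see the potential results difference, here is a minimal sketch. It runs against SQLite through Python's `sqlite3` module rather than SQL Server, and the tables `t1` and `t2` and their rows are made up for illustration; the `UNION` versus `UNION ALL` semantics shown are standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (Row1 INTEGER, Row2 TEXT);
    CREATE TABLE t2 (Row1 INTEGER, Row2 TEXT);
    -- t1 contains a duplicate row, and (2, 'b') appears in both tables.
    INSERT INTO t1 VALUES (1, 'a'), (1, 'a'), (2, 'b');
    INSERT INTO t2 VALUES (2, 'b'), (3, 'c');
""")

# UNION removes duplicate rows from the combined result (an implicit DISTINCT).
union_rows = conn.execute(
    "SELECT Row1, Row2 FROM t1 UNION SELECT Row1, Row2 FROM t2"
).fetchall()

# UNION ALL simply appends the second result set to the first.
union_all_rows = conn.execute(
    "SELECT Row1, Row2 FROM t1 UNION ALL SELECT Row1, Row2 FROM t2"
).fetchall()

print(len(union_rows))      # number of distinct rows
print(len(union_all_rows))  # number of rows including duplicates
```

If the two counts match on real data, the duplicate-removal step is doing no useful work and `UNION ALL` is likely safe; if they differ, decide which result the report actually needs.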

I always look at the word DISTINCT as a clue to see if there is a better way to present the data. DISTINCT is costly compared to using a derived table or some other method to avoid its use.

Avoid the implied join syntax to avoid getting an accidental cross join (which people often fix, shudder, with distinct). (Generating a list of 4 million records and then distincting to get the three you want is costly.)
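
A small sketch of the accidental cross join, again using SQLite via Python's `sqlite3` as a stand-in for SQL Server. The `customers` and `orders` tables are hypothetical. Note that in this example the DISTINCT "fix" is not just slow but also returns a customer who has no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Implied join with the join condition forgotten: every customer pairs with every order.
accidental = conn.execute("SELECT c.name FROM customers c, orders o").fetchall()

# DISTINCT hides the duplicates, but Cy (who has no orders) is still in the result.
masked = conn.execute("SELECT DISTINCT c.name FROM customers c, orders o").fetchall()

# Explicit JOIN syntax requires an ON clause, so the condition cannot be silently dropped.
explicit = conn.execute(
    "SELECT c.name FROM customers c JOIN orders o ON o.customer_id = c.id"
).fetchall()

print(len(accidental))   # size of the cross product
print(sorted(masked))    # names after DISTINCT over the cross product
print(sorted(explicit))  # one row per order, customers with orders only
```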

Avoid views that call other views! We have some folks who designed one whole client database that way and performance is HORRIBLE! Do not go down that path.

Avoid syntax like WHERE MyField LIKE '%test%'

That and other non-sargable WHERE clauses can keep the optimizer from using indexes.

HLGEM
What's the alternative to `LIKE '%test%'`? Performance is obviously lousy, but if you actually need to search for a substring...
Aaronaught
@Aaronaught: Full Text Search (FTS).
OMG Ponies
@HLGEM: +1, and I'm thankful that SQL Server will return an error if you attempt to define an `ORDER BY` in a view without using `TOP` ::shivers::
OMG Ponies
Besides fulltext search, the solution is often to normalize the design and stop putting together things in one field that belong in a related table. Or require the user to give you the first letter of the string. Or do a search without the wildcards and only do the wildcard search if the first search returns no records. The first search will be fast, and users will learn that if they type in the whole word they will find things faster. There will be less work for your server to do most of the time and more work only occasionally.
HLGEM
@HLGEM: Fulltext search can't handle substrings, so what happens when someone actually needs to *search* for *all records* containing some text? It's a common enough requirement, even with a normalized database. The fallback approach is a nice idea, but the behaviour is wrong when *some* of the results begin with the search string but others only contain it.
Aaronaught
@OMG Ponies: FTS is not an alternative to `LIKE '%test%'`. It is an alternative to `LIKE 'test%'` if you use prefix search; otherwise it can only find whole words, which is not always what a user needs to search for.
Aaronaught
@Aaronaught: CONTAINS will return any row where the supplied string/phrase exists within it: http://msdn.microsoft.com/en-us/library/ms187787.aspx
OMG Ponies
@OMG Ponies: `CONTAINS` can only find a word in a phrase, not part of a word (unless it is also the beginning of said word).
Aaronaught
At any rate, these are things to look for that might cause performance issues; that doesn't mean they can be replaced 100% of the time. If that were true, we wouldn't need these abilities.
HLGEM
+2  A: 

Avoid

CURSOR - use set based ops

SELECT * - name your columns explicitly

EXEC(@dynamic_sql_with_input_parms) - use sp_executesql with input parameters.
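
The last point can be illustrated by analogy. The sketch below does not call `sp_executesql`; it uses placeholder binding in Python's `sqlite3` module, which rests on the same idea of passing input as parameters instead of concatenating it into the SQL text. The `users` table and the hostile input string are made up. One benefit of parameters (an assumption about the answer's motivation, which it does not spell out) is that input can no longer change the meaning of the statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

# Hypothetical hostile input that tries to widen the WHERE clause.
user_input = "x' OR '1'='1"

# Concatenation: the input becomes part of the SQL text and rewrites the condition.
concatenated = conn.execute(
    "SELECT name FROM users WHERE name = '" + user_input + "'"
).fetchall()

# Bound parameter: the input is treated purely as a value to compare against.
bound = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(concatenated))  # rows returned by the concatenated statement
print(bound)              # rows returned by the parameterized statement
```

In T-SQL the parameterized form would look roughly like `EXEC sp_executesql @sql, N'@name nvarchar(50)', @name = @input`, where `@sql` contains `@name` as a placeholder; the variable names here are illustrative.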

Sam