views:

288

answers:

4

Hi All,

I have a User table where there are a Username and Application columns. Username may repeat but combination of Username + Application is unique, but I don't have the unique constraint set on the table (for performance)

Question: will there be any difference (performance-wise) between :

SELECT * FROM User where UserName='myuser' AND Application='myapp'

AND -

SELECT TOP 1 * FROM User where UserName='myuser' AND Application='myapp'

As combination of Username + Application is unique, both queries will always return no more than one record, so TOP 1 doesn't affect the result. I always thought that adding TOP 1 will really speed things up as sql server would stop looking after it found one match, but I recently read in an article that using TOP will actually slow things down and it's recommended to avoid, though they haven't explained why.

Any comments?

Thank you! Andrey

+7  A: 

If there's more than one row in the results and no ORDER BY clause, the "TOP 1" saves a ton of work for the server. If there's an order by clause the server still has to materialize the entire result set anyway, and if there's only one row it doesn't really change anything.

Joel Coehoorn
It will always return one or no rows as combination of Username and Application is unique. So do I understand correctly that TOP 1 will make absolutely no difference in my case above?
Andrey
I wouldn't say "zero". Query optimizers are finicky things, so I'd throw some weasel words in there like "probably trivial if any".
Joel Coehoorn
It will make all the difference int he world for some values, none for other. When you ask for the user and app that happen to be the first one in the physical order in the table the TOP 1 will seem miracoulos as it will stop the scan after one record. When you ask for the user and app near the end of the b-tree the TOP 1 won't make any difference. What you really need are covering indexes.
Remus Rusanu
+2  A: 

I think it depends on the query execution plan that SQL generates ... In the past on previous versions of SQL Server I have seen the use of a superfluous 'TOP' deliver definite performance benefits with complex queries with many joins. But definitely not in all cases.

I guess the best advice I can give is to try it out on a case by case basis.

codeulike
+1  A: 

you say you do not enforce the constraint, that translates there is no unique index on (UserName, Application) or (Application, UserName). Can the query use an access path that seeks either on UserName or Application? In other words, is any of these two columns indexed? If yes, then the plan will pick the most selective one which is indexed and do a range scan, possibly a nested loop with a bookmark lookup if the index is non-clustered, then a filter. Top 1 will stop the query after the first filter is matched, but whether this makes a difference depends on the cardinality of the data (how many records the range scan finds and how many satisfy the filter).

If there is no index then it will do a full clustered scan no matter what. Top 1 will stop the scan on first match, whether this is after processing 1 record or after processing 999 mil. records depdends on the actual user name and application...

The only thing that wil make a real difference is to allow the query to do a seek for both values, ie. have a covering index. The constraint would be enforced through exactly such a covering index. In other words: by turning off the constraint, presumably for write performance, be prepared to pay the price at reads. Is this read important? Did you do any measurement to confirm that the extra index write of the constraint would be critically dampening the performance?

Remus Rusanu
+3  A: 

You may get some performance difference from just using top, but the real performance you get by using indexes.

If you have an index for the UserName and Application fields, the database doesn't even have to touch the table until it has isolated the single record. Also, it will already know from the table statistics that the values are unique, so using top makes no difference.

Guffa
Creating an index on the two columns is a good idea, but I don't want to create a unique index, as it will slow down inserts, which are a critical part of the app.
Andrey
Do you know that switching the index to unique will slow down the inserts (over a non-unique index) or are you just suspecting that it will? If you assume uniqueness, you should back it up with a unique index.
Rob Garrison
A unique index will slow down inserts as it has to check for existance before inserting, but the difference should be quite small. You may get somewhat better read performance from a unique index, but the main advantage is that it ensured the integrity in the table.
Guffa
Guffa wrote, "A unique index will slow down inserts as it has to check for existence before inserting, but the difference should be quite small." Agreed. The difference is likely to be so small as to not be measurable. Guaranteeing the uniqueness is worth whatever tiny difference.
Rob Garrison