views:

935

answers:

3

I am not interested in the contents of a row, I just want to know if a row exists. The Name column is a primary key, so there will either be 0 or 1 matching rows. Currently, I am using:

if ((from u in dc.Users where u.Name == name select u).Count() > 0)
    // row exists
else
    // row doesn't exist

While the above works, it does a lot of unnecessary work by selecting all the contents of the row (if it exists). Does the following create a faster query:

if (dc.Users.Where(u => u.Name == name).Any())

...or is there an even faster query?

+8  A: 

The Count() approach may do extra work, as (in TSQL) EXISTS or TOP 1 are often much quicker; the db can optimise "is there at least one row". Personally, I would use the any/predicate overload:

if (dc.Users.Any(u => u.Name == name)) {...}

Of course, you can compare what each one does by watching the TSQL:

dc.Log = Console.Out;
Marc Gravell
Extra upvote for you just because I never realized there was an SQL logging feature for Linq to Sql. I've been having to run SQL Profiler this whole time.
David
@David - indeed. I keep asking the MS data team to add it for EF ;-p
Marc Gravell
A: 

I think:

if (dc.Users.Any(u => u.Name == name)) {...}

is the best approach..

Regards!!

MRFerocius
A: 

I do not agree that selecting top 1 will always outperform select count for all SQL implementations. It's all implementation dependent, you know. Curiously, even the nature of data stored in a particular database also affects the overall outcome.

Let's examine both of them the way I would implement them if I were to do so: For both cases, the projection (WHERE clause) evaluation is a common step.

Next for select top 1, you will have to do a read of all fields (unless you did select top 1 'x' eg: select top 1 1). This will be functionally equivalent to IQueryable.Any(...)., except that you will spend some time flashing in the value for each column of the first encountered record if EXISTS. If SELECT TOP is found in the statement, the projection is truncation if there's no post-projection proc (eg ORDER BY clause). This preprocess incurs a small cost but this is extra cost if no record does exist, in which case, a full project is still done.

For select count, the preprocess is not done. A projection is done and if EXISTS is false, the result is instant. If EXISTS is true, the count is still fast because it will be a mere dW_Highest_Inclusive - dW_Lowest_Exclusive. As quick as 500 - 26. If exists is false, the result is even more instant.

The remaining case therefore is: How fast is the projection and what do you loose by doing full projection? And the answer leads to the most crucial issue here which is: is the [NAME] field indexed or not! If you have an index on [NAME] the performance of either query will be so close that it boils down to developer's preference.

By and large, I will simply write two to four linq queries and output difference in time before and after.

  1. select count
  2. select top 1
  3. select top 1 1
  4. select any

Repeat all 4 with an nonclustered index on [NAME];

Pita.O