ansaurus

Question

Count(*) vs Count(Id) in sql server 2005

Answer 1

+3 A:

count(id) needs to null-check the column (which may be optimized away for a primary key or otherwise not-null column), so count(*) or count(1) should be prefered (unless you really want to know the number of rows with a non-null value for id).

Thilo 2010-02-08 05:56:41

Answer 2

+3 A:

Thilo nailed the difference precisely... COUNT( column_name ) can return fewer than COUNT( * ) if column_name values can be NULL.

However if I can take a slightly different angle at answering your question, since you seem to be focusing on performance.

First, note that issuing SELECT COUNT(*) FROM table will potentially block writers, and it will also be blocked by other readers/writers unless you have altered the isolation level (knee-jerk tends to be NOLOCK but I'm seeing a promising number of people finally starting to believe in snapshot). If you need an absolutely transactionally consistent and accurate row count (even if it is only valid for the number of milliseconds it takes to return the number to you), then SELECT COUNT( * ) is your only choice.

On the other hand, if you are trying to get a 99.9% accurate ballpark, you are much better off with a query like this:

SELECT row_count = SUM(row_count)
  FROM sys.dm_db_partition_stats
  WHERE [object_id] = OBJECT_ID('dbo.Table')
  AND index_id IN (0,1);

(The SUM is there to account for partitioned tables - if you are not using table partitioning, you can leave it out.)

This DMV maintains accurate row counts for tables with the exception of rows that are currently participating in transactions - and those very transactions are the ones that will make your SELECT COUNT query wait. But otherwise this will lead to a much quicker answer than the query you propose, and no less accurate than using NOLOCK.

Aaron Bertrand 2010-02-08 06:27:50

Will Count(Primary_Key) would be a better option than the query above ? It should always give an accurate count since there cannot be a primary key with null values.

Aseem Gautam 2010-02-08 06:50:30

COUNT(PK) and COUNT(*) will be identical in that case. Each will give an accurate count, but can still be a blocker and can still be blocked. So whether that potential performance killer is worth the accuracy is up to you (keeping in mind that if you "solve" the problem by using NOLOCK, you've lost your accuracy anyway).

Aaron Bertrand 2010-02-08 06:56:41

I always use to think Count(PK) performs better than Count(*). Thanks for clearing this out.

Aseem Gautam 2010-02-08 06:59:09

The optimizer knows when it encounters COUNT(*) with no WHERE clause that it should go after the narrowest non-filtered index, which is hopefully the PK.

Aaron Bertrand 2010-02-08 07:07:05

polyglot 2010-02-09 12:54:07

I have yet to see a case where COUNT() returns faster than accessing the DMVs. At the very best case, they could only be equal, but in a lot of cases, the COUNT() query is going to have to go inspect pages. The DMV query will always pluck the numbers out of memory.

Aaron Bertrand 2010-02-09 13:47:46

Answer 3

+1 A:

Already answered but pretty good discussion about this here:

http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/31795984-ea62-4b2c-8c78-6e986f2bcea0

Aseem Gautam 2010-02-09 06:11:36

ansaurus

tags:

views:

answers:

Count(*) vs Count(Id) in sql server 2005

related questions