Why can't we use count(distinct *) in SQL, as in, to count all distinct rows?

A: 

You can indeed.

If you've got an identifier column, though, no two rows will be entirely identical. But you could count the distinct values of a single column, for instance:

SELECT COUNT(DISTINCT SenderID) FROM Messages
David Hedlund
+4  A: 

You can select all the columns in your table and group by...

SELECT column1, column2, column3, count(*)
FROM someTable
GROUP BY column1, column2, column3
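
A minimal sketch of how the grouping above relates to the distinct-row count, run against an in-memory SQLite database from Python. The table contents are made up for illustration, and other DBMSs may behave differently in edge cases; the point is only that the number of groups equals the number of distinct rows.

```python
import sqlite3

# Hypothetical data for illustration; only the table and column names come from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE someTable (column1 TEXT, column2 TEXT, column3 INTEGER)")
conn.executemany(
    "INSERT INTO someTable VALUES (?, ?, ?)",
    [("a", "x", 1), ("a", "x", 1), ("b", "y", 2), ("b", "y", 3)],
)

# One output row per distinct combination, together with how often it occurs.
groups = conn.execute(
    "SELECT column1, column2, column3, COUNT(*) "
    "FROM someTable "
    "GROUP BY column1, column2, column3 "
    "ORDER BY column1, column2, column3"
).fetchall()

# The number of groups is the number of distinct rows.
group_count = len(groups)
print(groups)
print(group_count)
```

The query on its own returns the groups rather than a single number, so it still has to be counted, either in the client as here or by wrapping it in an outer SELECT COUNT(*).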
Jason Punyon
This doesn't actually get what the question asks though.
Tom H.
+2  A: 

why not?

select 
  count(distinct name)
from 
  people
silent
Because two people could have the same name. The OP asked about COUNT(DISTINCT *).
Maximilian Mayerl
Sorry, it was another text in the question when I answered.
silent
+9  A: 
select count(*) from (select distinct * from MyTable) as T

Although I strongly suggest that you re-think any queries that use DISTINCT. In a large percentage of cases, GROUP BY is more appropriate (and faster).

EDIT: Having read the question comments, I should point out that you should never ask the DBMS to do more work than actually needs doing to get a result. If you know in advance that there will not be any duplicated rows in a table, then don't use DISTINCT.
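
For anyone who wants to try the subquery form, here is a small sketch using Python's built-in sqlite3 module. The table layout and rows are assumptions made up for the demo; only the final query comes from this answer, and it has been checked here against SQLite only.

```python
import sqlite3

# Hypothetical table with one fully duplicated row, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (a TEXT, b INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("x", 1), ("x", 1), ("y", 2), ("x", 2)],
)

# Plain row count, duplicates included.
total_rows = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]

# Distinct-row count: de-duplicate in a derived table, then count its rows.
distinct_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT * FROM MyTable) AS T"
).fetchone()[0]

print(total_rows, distinct_rows)
```

The two counts differ only when the table really contains duplicated rows, which is the case the EDIT above is warning about.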

Christian Hayter
@Christian - hi, I've never seen it done that way before. Out of curiosity I ran it against an existing table on my end and I'm getting 'Incorrect syntax near ')''. Does it work in MS SQL Server? Note: the select within the brackets runs perfectly on its own.
Kamal
On Oracle, DISTINCT and GROUP BY have the same execution plan because DISTINCT is implemented using GROUP BY. So there should be no difference.
Mr. Shiny and New
@Kamal: Sorry, forgot that SQL Server is a bit iffy with nested queries. Adding an alias name onto the end (`as T`) solved the problem.
Christian Hayter
@Christian - cheers :)
Kamal
@MrShiny: It's all about requesting the least amount of work to be done. Conceptually speaking, `GROUP BY` is asking for less work than `DISTINCT`, and will therefore *sometimes*, not always, result in a more efficient plan. See also the difference between `JOIN` and `EXISTS`.
Christian Hayter
A: 

UberKludge, and may be PostgreSQL-specific, but (with someTable standing in for your table name, since `table` itself is a reserved word):

select count( distinct t::text ) from someTable t
Richo
A: 

You can try a CTE in SQL Server 2005 or later:

;WITH cte AS (
     SELECT DISTINCT Val1,Val2, Val3
     FROM @Table
)
SELECT  COUNT(1)
FROM    cte

To answer the question, from the documentation:

Specifies that all rows should be counted to return the total number of rows in a table. COUNT(*) takes no parameters and cannot be used with DISTINCT. COUNT(*) does not require an expression parameter because, by definition, it does not use information about any particular column. COUNT(*) returns the number of rows in a specified table without getting rid of duplicates. It counts each row separately. This includes rows that contain null values.

astander
When you say, "From the documentation" which documentation would that be?
Tom H.
SQL Server Help Documentation. http://msdn.microsoft.com/en-us/library/ms175997.aspx
astander
A: 

COUNT(*) is the number of rows matching a query.

A row contains unique information such as rowid. All rows are by definition distinct.

You must count the distinct instances of values in some field instead.

Will
Why does a row contain unique information? It doesn't have to... it probably should, but it's not *required*.
Murph
A: 

Some languages may not be able to handle 'distinct *', so if you want to distinguish rows across several columns you might use 'distinct ColumnA || ColumnB', combining the values before judging whether they differ. Be mindful of whether your columns are numeric and whether your database can automatically cast them to character strings.

Jose Antonio Padros
This also isn't a foolproof method as-is. For example, ('test', 'string') and ('tes', 'tstring') would look the same once concatenated. You could do something with padding the strings, but it gets messy. Better IMO to just use a subquery with DISTINCT and then get a COUNT from that.
Tom H.
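
The collision raised in the comment above can be reproduced with a short sketch, again using Python's sqlite3 module. The table name `pairs` and its two rows are hypothetical, chosen so that both rows concatenate to the same string; the column names follow the answer.

```python
import sqlite3

# Two different rows whose concatenation is identical: 'teststring'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (ColumnA TEXT, ColumnB TEXT)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [("test", "string"), ("tes", "tstring")],
)

# Concatenation approach: the column boundary is lost, so the rows collapse into one.
concat_count = conn.execute(
    "SELECT COUNT(DISTINCT ColumnA || ColumnB) FROM pairs"
).fetchone()[0]

# Subquery approach: columns are compared separately, so both rows are kept.
subquery_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT ColumnA, ColumnB FROM pairs) AS d"
).fetchone()[0]

print(concat_count, subquery_count)
```

Putting a separator between the columns narrows the problem but only helps if the separator can never occur in the data, which is why the subquery form is the safer default.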