I have three SQL tables:
Data36 (Data_ID:int <PK>, type:int),
Data38 (Data_ID:int <PK>, clientId:int),
Data47 (Data_ID:int <PK>, payerID:int).
I expected the following two queries to be equivalent: neither uses an aggregate function in the select list, so GROUP BY should behave the same way as DISTINCT. But they return very different result sets, and I don't understand why. Please help me understand the difference between these queries.
Query 1 (returns 153 rows):
SELECT payer.Data_ID, payer.type
FROM Data36 AS payer
JOIN Data38 AS serv ON payer.Data_ID = serv.clientId
WHERE ((SELECT count(*) FROM Data47 AS regsites WHERE regsites.payerID = payer.Data_ID) = 0)
GROUP BY payer.Data_ID, payer.type
Query 2 (returns 4744 rows):
SELECT DISTINCT payer.Data_ID, payer.type
FROM Data36 AS payer
JOIN Data38 AS serv ON payer.Data_ID = serv.clientId
WHERE ((SELECT count(*) FROM Data47 AS regsites WHERE regsites.payerID = payer.Data_ID) = 0)
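A diagnostic sketch that might help narrow this down: it lists the rows the DISTINCT form returns that the GROUP BY form does not. It assumes the server accepts derived tables in the FROM clause; the aliases d and g are hypothetical names introduced only for this example.

```sql
-- Rows present in the DISTINCT result (d) but absent from the GROUP BY result (g).
-- d and g are illustrative aliases, not names from the original schema.
SELECT d.Data_ID, d.type
FROM (
    SELECT DISTINCT payer.Data_ID, payer.type
    FROM Data36 AS payer
    JOIN Data38 AS serv ON payer.Data_ID = serv.clientId
    WHERE (SELECT count(*) FROM Data47 AS regsites
           WHERE regsites.payerID = payer.Data_ID) = 0
) AS d
LEFT JOIN (
    SELECT payer.Data_ID, payer.type
    FROM Data36 AS payer
    JOIN Data38 AS serv ON payer.Data_ID = serv.clientId
    WHERE (SELECT count(*) FROM Data47 AS regsites
           WHERE regsites.payerID = payer.Data_ID) = 0
    GROUP BY payer.Data_ID, payer.type
) AS g ON g.Data_ID = d.Data_ID
WHERE g.Data_ID IS NULL;  -- keep only rows with no match in g
```

Checking a few of the returned Data_ID values by hand against Data47 should show which of the two queries is filtering incorrectly.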
The SQL server version is 5.0.40.
Let me know if you need more specific information.
Update: Sorry for not mentioning this earlier: Data_ID is the primary key in each of these tables, so it is unique for each record.
SELECT count( * ) FROM Data36
--returns 5998
SELECT count(DISTINCT Data_ID) FROM Data36
--returns 5998
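The same uniqueness check can be phrased so that it names any offending values directly. This is only a sketch; the alias n is a hypothetical name, and with Data_ID as primary key the query would be expected to return no rows.

```sql
-- List any Data_ID that occurs more than once in Data36.
-- An empty result is consistent with Data_ID being unique.
SELECT Data_ID, count(*) AS n
FROM Data36
GROUP BY Data_ID
HAVING count(*) > 1;
```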
Update 2: In Query 1 I changed 'GROUP BY payer.Data_ID' to 'GROUP BY payer.Data_ID, payer.type'. The result is still the same - 153 rows.
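As a cross-check on which row count is the plausible one, the filter can be restated with NOT EXISTS, which should be logically equivalent to comparing the correlated count(*) with 0. This is a sketch under that assumption; the alias payers_without_regsites is a hypothetical name.

```sql
-- Count the distinct payers that appear in Data38 (via clientId)
-- and have no matching row in Data47 (via payerID).
SELECT count(DISTINCT payer.Data_ID) AS payers_without_regsites
FROM Data36 AS payer
JOIN Data38 AS serv ON payer.Data_ID = serv.clientId
WHERE NOT EXISTS (
    SELECT 1
    FROM Data47 AS regsites
    WHERE regsites.payerID = payer.Data_ID
);
```

Because it returns a single number, it can be compared directly with the 153 and 4744 row counts above.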