views:

61

answers:

7

I have a table of about 1.6M rows where the records have a unique (identity) ID and then also have a 8-character "code" field. The code field it used to group data into sets...all rows with the same code are in the same set. Each set should consist of 12 records, but it seems our data load was off and some sets are incomplete.

I need to identify the incomplete sets. How would I write a query to select out only those records that are part of a set that do NOT have the proper number of records in their set (12)?

I am using MS SQL 2008.

Thanks.

+1  A: 

This will give you the individual records that need to be updated:

   SELECT *
    FROM MyTable
    WHERE CodeField IN
    (
        SELECT CodeField,
        COUNT(CodeField) AS [Count]
        FROM MyTable
        GROUP BY CodeField
        HAVING COUNT(CodeField) <> 12
    ) 
flayto
Error - Too many fields in the IN clause. You have one being compared to two - MySQL supports tuples but I haven't tried it on SQL Server. This would also run over the same table twice, when it can be done in a single pass - the subquery is really all you need.
OMG Ponies
+4  A: 

Try this one...

SELECT 
 CodeField, 
 COUNT(CodeField) 
FROM table 
GROUP BY CodeField 
HAVING COUNT(CodeField) < 12
Yves M.
+1: You beat me by 16 seconds
OMG Ponies
I wasn't expecting to many answers so quickly. Thanks all! Worked well.
Bryan Lewis
+4  A: 

Use this to get codes with less than 12 records in the table:

  SELECT yt.code
    FROM YOUR_TABLE yt
GROUP BY yt.code
  HAVING COUNT(*) < 12

For this exercise, you have to use the HAVING clause - the WHERE clause doesn't allow you to use aggregate functions (MIN, MAX, COUNT...) without them being in a subquery, but the HAVING clause doesn't have that restriction.

To get codes with both less or more than 12 records in the table, use:

  SELECT yt.code
    FROM YOUR_TABLE yt
GROUP BY yt.code
  HAVING COUNT(*) != 12

Nitpick: != is ANSI-92, vs <>

OMG Ponies
Bah! 14 seconds!
LittleBobbyTables
rofl. well still vote you up mate
Yves M.
@LittleBobbyTables: You're getting closer - almost too close :p
OMG Ponies
@OMG: you get points for the != factor. Because it did turn out that we had some sets with too many items also. :)
Bryan Lewis
+1  A: 

This is a perfect case for using the having clause:

select codefield, count(*) from YOURTABLE
group by codefield
having count(*) <> 12

The HAVING clause allows you to restrict the results of a query that includes a GROUP BY to just those records that match a certain condition. You can think of it as a variation on a WHERE clause that works with grouped result sets.

A query can include both a WHERE and a HAVING clause - however, realize that the predicate in the WHERE clause is evaluated before results are grouped, while the predicate in the HAVING clause is evaluated after.

LBushkin
+1  A: 

Note that I have used <> rather than <. This will identify codes that may have too many items as well as codes that have too few.

select code, count(*) as count
from MyTable
group by code
having count(*) <> 12
RedFilter
A: 

This should work:

  select CodeField, count(*)
    from yourtable
    group by CodeField
    having count(*) <> 12
Mladen Prajdic
A: 

Hi,

You could accomplish this with a group by. Something like the following:

select 
  groups.code,
  groups.code_count
 from
    (select 
      code,
      count(code) code_count
    from
     records_table
    group by code)  groups
where 
  groups.code_count != 12

Enjoy!

Doug