ansaurus

Question

Test the sequentiality of a column with a single SQL query

Answer 1

+1 A:

Look for sets where MAX(n) - MIN(n) + 1 > COUNT(n). When the range is wider than the number of rows, at least one value is missing:

IF EXISTS (SELECT set_ID
             FROM mytable
            GROUP BY set_ID
           HAVING MAX(n) - MIN(n) + 1 > COUNT(n)
          )
    ROLLBACK
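To see the gap test in action, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine (an assumption; the thread itself targets SQL Server and MySQL). The sample rows are made up for illustration; only the HAVING clause comes from the answer.

```python
import sqlite3

# Hypothetical sample data: set-1 is contiguous, set-2 is missing n = 3.
rows = [("set-1", 1), ("set-1", 2), ("set-1", 3),
        ("set-2", 1), ("set-2", 2), ("set-2", 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (set_ID TEXT, n INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)

# A set has a gap when its value range is wider than its row count.
gappy_sets = [r[0] for r in conn.execute(
    """
    SELECT set_ID
      FROM mytable
     GROUP BY set_ID
    HAVING MAX(n) - MIN(n) + 1 > COUNT(n)
     ORDER BY set_ID
    """)]
print(gappy_sets)
```

In a real transaction you would roll back whenever that list is non-empty, which is what the IF EXISTS ... ROLLBACK wrapper does.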

If the sequence must start at 1, flag every set whose minimum is not 1 or whose maximum exceeds its row count:

IF EXISTS (SELECT set_ID
             FROM mytable
            GROUP BY set_ID
           HAVING MIN(n) <> 1 OR MAX(n) > COUNT(n)
          )
    ROLLBACK

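You also need to rule out duplicate sequence numbers, because the count test alone can be fooled: in 1-2-2-4 the duplicate offsets the gap (max - min + 1 = 4 = count). A unique key on (set_ID, n) takes care of this.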
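A small experiment shows why duplicates matter. This is a sketch using Python's sqlite3 module as a stand-in engine; the table name `keyed` and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (set_ID TEXT, n INTEGER)")
# Hypothetical data: in 1-2-2-4 the duplicate and the gap cancel out in the count test.
conn.executemany("INSERT INTO mytable VALUES ('set-1', ?)", [(1,), (2,), (2,), (4,)])

flagged = conn.execute(
    """
    SELECT set_ID
      FROM mytable
     GROUP BY set_ID
    HAVING MAX(n) - MIN(n) + 1 > COUNT(n)
    """).fetchall()
# MAX - MIN + 1 is 4 and COUNT(n) is 4, so the broken set is not flagged.

# With a unique key on (set_ID, n), the duplicate cannot be stored at all.
conn.execute("CREATE TABLE keyed (set_ID TEXT, n INTEGER, UNIQUE (set_ID, n))")
conn.executemany("INSERT INTO keyed VALUES ('set-1', ?)", [(1,), (2,)])
try:
    conn.execute("INSERT INTO keyed VALUES ('set-1', 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The key does not replace the gap test; it only removes the one case that the test cannot see.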

Marcelo Cantos 2010-05-13 11:40:29

Sorry, forgot to mention that also situations like 1-2-3-3-5 should be avoided. It must be a sequence of unique values.

LauriE 2010-05-13 11:44:20

@LauriE: I've amended the answer to deal with this.

Marcelo Cantos 2010-05-13 11:56:30

Thanks! The unique key should be an even better approach than verifying it later.

LauriE 2010-05-13 12:00:32

Still - I do have a column called "OK" for soft-deleting the rows, so I can't create a unique index on set_ID and n, as there might be several soft-deleted rows with a non-unique n value. However, this was not part of the original question and I believe that structure should be preferred over "testing" if possible (unique index vs. count distinct). So I'm accepting this answer.

LauriE 2010-05-13 12:07:29

The unique index will make it difficult to move rows around within the 'n' column sequence, and it does not remove the need for testing, as it only guarantees unique values, not sequential ones.

KM 2010-05-13 12:10:29
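The "moving rows around" problem can be sketched as follows, again with Python's sqlite3 module as a stand-in engine. The placeholder value 0 is an assumption (any value not otherwise used in the set would do), and other engines may check the constraint at a different point than SQLite does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (ID INTEGER PRIMARY KEY, set_ID TEXT, n INTEGER,"
    " UNIQUE (set_ID, n))")
conn.executemany("INSERT INTO mytable VALUES (?, 'set-1', ?)", [(1, 1), (2, 2)])

# Naive move: putting row 1 at position 2 collides with row 2.
try:
    conn.execute("UPDATE mytable SET n = 2 WHERE ID = 1")
    naive_move_ok = True
except sqlite3.IntegrityError:
    naive_move_ok = False

# Workaround: park one row on a placeholder value (0, assumed unused), then finish the swap.
with conn:
    conn.execute("UPDATE mytable SET n = 0 WHERE ID = 1")
    conn.execute("UPDATE mytable SET n = 1 WHERE ID = 2")
    conn.execute("UPDATE mytable SET n = 2 WHERE ID = 1")

positions = conn.execute("SELECT ID, n FROM mytable ORDER BY ID").fetchall()
```

So the key is workable, but every reordering needs an extra step, and the sequentiality test is still required afterwards.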

I'm not sure how you're using it, but you may be able to add the `OK` column to the unique key. However, the rollback test, which is still needed in any case, may have to be cleverer to handle this.

Marcelo Cantos 2010-05-13 12:22:26
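One way the soft-delete case might be handled is sketched below with Python's sqlite3 module. It assumes `OK = 1` marks a live row and that the engine supports a partial (filtered) unique index, which not every engine does; the index name `live_seq` and the data are hypothetical. The check simply ignores soft-deleted rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (set_ID TEXT, n INTEGER, OK INTEGER)")
# Partial unique index: only live rows (OK = 1) need a unique n within a set.
conn.execute("CREATE UNIQUE INDEX live_seq ON mytable (set_ID, n) WHERE OK = 1")

conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    ("set-1", 1, 1), ("set-1", 2, 0), ("set-1", 2, 0),  # two soft-deleted rows share n = 2
    ("set-1", 2, 1), ("set-1", 3, 1),
    ("set-2", 1, 1), ("set-2", 2, 0), ("set-2", 3, 1),  # live rows are 1 and 3: a gap
])

# Same test as before, restricted to live rows.
bad_sets = [r[0] for r in conn.execute(
    """
    SELECT set_ID
      FROM mytable
     WHERE OK = 1
     GROUP BY set_ID
    HAVING MIN(n) <> 1 OR MAX(n) <> COUNT(n)
     ORDER BY set_ID
    """)]
```

Where partial indexes are not available, adding `OK` to the key as suggested above is the alternative, at the cost of allowing only one soft-deleted row per (set_ID, n).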

Answer 2

A:

Try this:

IF EXISTS (SELECT set_ID
               FROM mytable
               GROUP BY set_ID
               HAVING MIN(n) <> 1 OR MAX(n) <> COUNT(DISTINCT n)
          )
    ROLLBACK

This works on SQL Server (I don't have MySQL to try it out):

DECLARE @YourTable table (ID int, set_ID char(5), some_column char(10), n int)
INSERT @YourTable VALUES (1, 'set-1', 'aaaaaaaaaa', 1)
INSERT @YourTable VALUES (2, 'set-1', 'bbbbbbbbbb', 2)
INSERT @YourTable VALUES (3, 'set-1', 'cccccccccc', 3)
INSERT @YourTable VALUES (4, 'set-2', 'dddddddddd', 1)
INSERT @YourTable VALUES (5, 'set-2', 'eeeeeeeeee', 2)
INSERT @YourTable VALUES (6, 'set-3', 'ffffffffff', 2)
INSERT @YourTable VALUES (7, 'set-3', 'gggggggggg', 1)
INSERT @YourTable VALUES (8, 'set-3', 'ffffffffff', 4)
INSERT @YourTable VALUES (9, 'set-3', 'ffffffffff', 4)

--this will list all "bad" sets: those that do not start at 1 or that have a gap
SELECT set_ID
    FROM @YourTable
    GROUP BY set_ID
    HAVING MIN(n) <> 1 OR MAX(n) <> COUNT(DISTINCT n)

OUTPUT:

set_ID
------
set-3

KM 2010-05-13 11:47:43
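One case worth checking by hand is a duplicate with no gap, such as 1-2-2-3: there MAX(n) equals COUNT(DISTINCT n), so a test built on COUNT(DISTINCT n) alone does not see it. The sketch below (Python's sqlite3 module as a stand-in engine, with hypothetical set names and a hypothetical helper `bad_sets`) compares that test with a variant that also compares the plain row count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (set_ID TEXT, n INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [
    ("good", 1), ("good", 2), ("good", 3),
    ("gap", 1), ("gap", 2), ("gap", 4),
    ("dup", 1), ("dup", 2), ("dup", 2), ("dup", 3),  # duplicate, but no gap
])

def bad_sets(having):
    # `having` is a hard-coded clause here; never build SQL like this from user input.
    sql = f"SELECT set_ID FROM mytable GROUP BY set_ID HAVING {having} ORDER BY set_ID"
    return [r[0] for r in conn.execute(sql)]

distinct_only = bad_sets("MIN(n) <> 1 OR MAX(n) <> COUNT(DISTINCT n)")
with_row_count = bad_sets(
    "MIN(n) <> 1 OR MAX(n) <> COUNT(DISTINCT n) OR COUNT(n) <> COUNT(DISTINCT n)")
```

If a unique key on (set_ID, n) is already in place, the extra COUNT(n) comparison is redundant and the shorter test is enough.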

Very similar to Marcelo's solution; the COUNT DISTINCT seems to do the final trick. Perfect! Thanks, Lauri

LauriE 2010-05-13 11:55:59
