ansaurus

Question

Answer 1

+1 A:

Not the best solution, but it should work. This example keeps each row whose three immediately preceding Ids belong to the same employee:

SELECT Id, EmployeeID FROM
(
    SELECT r.Id, r.EmployeeID,
        -- each subquery counts 1 when the row at that offset exists for the same employee
        (SELECT COUNT(1) FROM recs r1 WHERE r1.EmployeeID = r.EmployeeID AND r1.Id = r.Id-1) AS c1,
        (SELECT COUNT(1) FROM recs r2 WHERE r2.EmployeeID = r.EmployeeID AND r2.Id = r.Id-2) AS c2,
        (SELECT COUNT(1) FROM recs r3 WHERE r3.EmployeeID = r.EmployeeID AND r3.Id = r.Id-3) AS c3
    FROM recs r
) tab1
WHERE (tab1.c1 + tab1.c2 + tab1.c3 = 3);

I assumed that Id is a primary (or unique) key. If it is not, change each subquery slightly, to something like SELECT IF(COUNT(1) > 0, 1, 0) ...

a1ex07 2010-02-22 23:41:24

Answer 2

+2 A:

Join the table to itself on table1.Id = table2.Id + 1 and table1.EmployeeID = table2.EmployeeID.
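A minimal sketch of that self-join, run through Python's built-in sqlite3 module so it can be tried without a database server. The sample rows and the function name consecutive_pairs are made up for illustration; only the join condition comes from the answer. Note that it returns every row that directly follows a row of the same employee, without regard to how long the run is.

```python
import sqlite3

# Hypothetical sample rows as (Id, EmployeeID); not the asker's data.
ROWS = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 5), (6, 1), (7, 1)]


def consecutive_pairs(rows):
    """Return (Id, EmployeeID) rows whose predecessor Id has the same employee."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recs (Id INTEGER PRIMARY KEY, EmployeeID INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO recs (Id, EmployeeID) VALUES (?, ?)", rows)
    query = """
        SELECT t1.Id, t1.EmployeeID
        FROM recs AS t1
        JOIN recs AS t2
          ON t1.Id = t2.Id + 1
         AND t1.EmployeeID = t2.EmployeeID
        ORDER BY t1.Id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in consecutive_pairs(ROWS):
        print(row)
```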

Gabriel McAdams 2010-02-22 23:42:50

This is the first step, but I still need to get blocks of data with at least 5 consecutive IDs. Your solution will fetch all consecutive rows.

Anax 2010-02-23 00:47:40

Answer 3

A:

Use a temp table for this:

SELECT EmployeeID, MIN(Id) AS Min, MAX(Id) AS Max, COUNT(*) AS Count
INTO #TempTable
FROM recs
GROUP BY EmployeeID

SELECT * FROM #TempTable WHERE
Count > 5 AND
       Max - Min + 1 = Count

EDITED ANSWER

Please try this:

SELECT * FROM (
    SELECT EmployeeID, MIN(Id) AS min, MAX(Id) AS max, COUNT(*) AS count
    FROM recs
    GROUP BY EmployeeID) AS [Table]  -- TABLE is a reserved word, so the alias is bracket-quoted
WHERE [Table].count > 5 AND
      [Table].max - [Table].min + 1 = [Table].count

masoud ramezani 2010-02-23 04:08:41

I believe this works exactly like the query I provided: it only fetches a block of data when all of an employee's rows form a single block.

Anax 2010-02-23 08:04:20

Please see the edited answer.

masoud ramezani 2010-02-23 08:30:18

This still won't work. Try it on the provided data set (replace Table.count > 5 with Table.count >= 2) to see it for yourself. You're still approaching the problem in the same way.

Anax 2010-02-23 12:43:27

Answer 4

A:

Wow, this was a real brain teaser. I'm sure this has all kinds of holes but here's a possible solution. First our test data:

If Exists(Select 1 From INFORMATION_SCHEMA.TABLES Where TABLE_NAME = 'recs')
    DROP TABLE recs
GO
Create Table recs
(
    Id int not null
    , EmployeeId int not null
)
Insert recs(Id, EmployeeId) 
Values (1,1) ,(2,1) ,(3,1) ,(4,2) ,(5,5) ,(6,1) ,(7,1) ,(8,1) ,(10,1)   
    ,(11,1) ,(12,1) ,(13,2) ,(14,2) ,(15,2) ,(16,2)

Next, you will need a Tally or Numbers table that contains a sequence of numbers. I only put 500 elements in this one, but given the size of the data you may want more. The largest number in the Tally table should be bigger than the largest Id in the recs table.

Create Table dbo.Tally(Num int not null)
GO
;With Numbers As
    (
    Select ROW_NUMBER() OVER ( ORDER BY s1.object_id) As Num
    From sys.columns as s1
    )
Insert dbo.Tally(Num)
Select Num
From Numbers
Where Num < 500

Now for the actual solution. Basically, I used a series of CTEs to deduce the start and end point of the consecutive sequences.

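-- Minimum run length to return. The value 3 is only an illustrative choice; adjust as needed.
Declare @SequenceSize int = 3;

; With 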
    Employees As 
    (
    Select Distinct EmployeeId 
    From dbo.Recs
    )
    , SequenceGaps As
    (
    Select E.EmployeeId, T.Num, R1.Id 
    From dbo.Tally As T
        Cross Join Employees As E
        Left Join dbo.recs As R1
            On R1.EmployeeId = E.EmployeeId
                And R1.Id = T.Num
    Where T.Num <= (    
        Select Max(R3.Id) 
        From dbo.Recs As R3
            Where R3.EmployeeId = E.EmployeeId
            )
    )
    , EndIds As
    (
    Select S.EmployeeId
        , Case When S1.Id Is Null Then S.Id End As [End]
    From SequenceGaps As S
        Join SequenceGaps As S1
            On S1.EmployeeId = S.EmployeeId
                And S1.Num = (S.Num + 1) 
    Where S.Id Is Not Null
        And S1.Id Is Null
    Union All
    Select S.EmployeeId, Max( Id )
    From SequenceGaps As S
    Where S.Id Is Not Null
    Group By S.EmployeeId
    )
    , SequencedEndIds As
    (
    Select EmployeeId, [End]
        , ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY [End]) As SequenceNum
    From EndIds
    )
    , StartIds As
    (
    Select S.EmployeeId
        , Case When S1.Id Is Null Then S.Id End As [Start]
    From SequenceGaps As S
        Join SequenceGaps As S1
            On S1.EmployeeId = S.EmployeeId
                And S1.Num = (S.Num - 1)
    Where S.Id Is Not Null
        And S1.Id Is Null
    Union All
    Select S.EmployeeId, 1 
    From SequenceGaps As S
    Where S.Id = 1
    )
    , SequencedStartIds As
    (
    Select EmployeeId, [Start]
        , ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY [Start]) As SequenceNum
    From StartIds
    )
    , SequenceRanges As
    (
    Select S1.EmployeeId, Start, [End]
    From SequencedStartIds As S1
        Join SequencedEndIds As S2
            On S2.EmployeeId = S1.EmployeeId
                And S2.SequenceNum = S1.SequenceNum
    )
Select *
From SequenceGaps As SG
Where Exists(
        Select 1
        From SequenceRanges As SR
        Where SR.EmployeeId = SG.EmployeeId
            And SG.Id Between SR.Start And SR.[End]
            And ( SR.[End] - SR.[Start] + 1 ) >= @SequenceSize
        )

The last condition in the WHERE clause, together with @SequenceSize, controls the minimum length of the sequences that are returned.
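For comparison, here is a shorter sketch of a different technique, the common "gaps and islands" idiom, which is not what the answer above does. Within one employee, Id minus ROW_NUMBER() stays constant across a run of consecutive Ids, so grouping on that difference yields one row per run. It is run through Python's sqlite3 module and assumes an SQLite build with window functions (3.25 or later) and at most one row per Id. The function name find_runs and the output column names are made up for illustration; the rows are the test data from this answer.

```python
import sqlite3

# Test data from the answer above, as (Id, EmployeeId).
ROWS = [
    (1, 1), (2, 1), (3, 1), (4, 2), (5, 5), (6, 1), (7, 1), (8, 1), (10, 1),
    (11, 1), (12, 1), (13, 2), (14, 2), (15, 2), (16, 2),
]


def find_runs(rows, min_size):
    """Return (EmployeeId, StartId, EndId, Size) for runs of >= min_size consecutive Ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recs (Id INTEGER NOT NULL, EmployeeId INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO recs (Id, EmployeeId) VALUES (?, ?)", rows)
    query = """
        WITH grouped AS (
            SELECT EmployeeId, Id,
                   -- constant within a run of consecutive Ids for one employee
                   Id - ROW_NUMBER() OVER (
                       PARTITION BY EmployeeId ORDER BY Id
                   ) AS grp
            FROM recs
        )
        SELECT EmployeeId, MIN(Id) AS StartId, MAX(Id) AS EndId, COUNT(*) AS Size
        FROM grouped
        GROUP BY EmployeeId, grp
        HAVING COUNT(*) >= ?
        ORDER BY StartId
    """
    result = conn.execute(query, (min_size,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for run in find_runs(ROWS, 3):
        print(run)
```

The min_size argument plays the same role as @SequenceSize. Joining the result back to recs on EmployeeId and Id BETWEEN StartId AND EndId would give the individual rows instead of one summary row per run.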

Thomas 2010-02-23 16:32:46


Help me find blocks of data
