ansaurus

Question

Grouping runs of data

Answer 1

A:

For this particular case, all you need to do is group by the name and ask for the count, like this:

select Name, count(*)
from MyTable
group by Name

That'll get you the count for each name as a second column.

You can get it all as one column by concatenating like this:

select Name + ' (' + cast(count(*) as varchar) + ')'
from MyTable
group by Name

Matt Hamilton 2008-08-22 00:16:00

Answer 2

+2 A:

@Matt: that groups all the rows with a particular Name. The OP wanted to group consequtive rows, I'm not aware of a way to do it in SQL.

Nickolay 2008-08-22 00:20:01

Answer 3

A:

@Nickolay Ah you're right - now that he has reformatted his question I can see the data more clearly.

Yes, I'm not sure of a pure SQL way to do this either. You'd have to do it with a cursor, I think.

Matt Hamilton 2008-08-22 00:20:56

Answer 4

+1 A:

Well, this:

select Name, count(Id)
from MyTable
group by Name

will give you this:

Harry Johns, 1
Adam Taylor, 1
John Smith, 2
Bill Manning, 1

and this (MS SQL syntax):

select Name +
    case when ( count(Id) > 1 ) 
         then ' ('+cast(count(Id) as varchar)+')' 
         else ''
    end
from MyTable
group by Name

will give you this:

Harry Johns
Adam Taylor
John Smith (2)
Bill Manning

Did you actually want that other John Smith on the end of your results?

EDIT: Oh I see, you want consecutive runs grouped. In that case, I'd say you need a cursor or to do it in your program code.

Blorgbeard 2008-08-22 00:25:27

Answer 5

+2 A:

Try this:

select n.name, 
    (select count(*) 
     from myTable n1
     where n1.name = n.name and n1.id >= n.id and (n1.id <=
        (
        select isnull(min(nn.id), (select max(id) + 1 from myTable))
        from myTable nn
        where nn.id > n.id and nn.name <> n.name
        )
     ))
from myTable n
where not exists (
   select 1
   from myTable n3
   where n3.name = n.name and n3.id < n.id and n3.id > (
            select isnull(max(n4.id), (select min(id) - 1 from myTable))
            from myTable n4
            where n4.id < n.id and n4.name <> n.name
            )
)

I think that'll do what you want. Bit of a kludge though.

Phew! After a few edits I think I have all the edge cases sorted out.

Matt Hamilton 2008-08-22 00:57:08

Answer 6

+2 A:

I hate cursors with a passion... but here's a dodgy cursor version...

Declare @NewName Varchar(50)
Declare @OldName Varchar(50)
Declare @CountNum int
Set @CountNum = 0

DECLARE nameCursor CURSOR FOR 
SELECT Name
FROM NameTest
OPEN nameCursor

FETCH NEXT FROM nameCursor INTO @NewName

  WHILE @@FETCH_STATUS = 0 

    BEGIN

      if @OldName <> @NewName
      BEGIN
         Print @OldName + ' (' + Cast(@CountNum  as Varchar(50)) + ')'
         Set @CountNum = 0
      END
      SELECT @OldName = @NewName
      FETCH NEXT FROM nameCursor INTO @NewName
      Set @CountNum = @CountNum + 1

    END
Print @OldName + ' (' + Cast(@CountNum  as Varchar(50)) + ')'

CLOSE nameCursor
DEALLOCATE nameCursor

Leon Bambrick 2008-08-22 01:08:59

Answer 7

+1 A:

How about this:

declare @tmp table (Id int, Nm varchar(50));

insert @tmp select 1, 'Harry Johns';
insert @tmp select 2, 'Adam Taylor';
insert @tmp select 3, 'John Smith';
insert @tmp select 4, 'John Smith';
insert @tmp select 5, 'Bill Manning';
insert @tmp select 6, 'John Smith';

select * from @tmp order by Id;

select Nm, count(1) from 
(
select Id, Nm, 
    case when exists (
     select 1 from @tmp t2 
     where t2.Nm=t1.Nm 
     and (t2.Id = t1.Id + 1 or t2.Id = t1.Id - 1)) 
     then 1 else 0 end as Run
from @tmp t1
) truns group by Nm, Run

[Edit] That can be shortened a bit

select Nm, count(1) from (select Id, Nm, case when exists (
     select 1 from @tmp t2 where t2.Nm=t1.Nm 
     and abs(t2.Id-t1.Id)=1) then 1 else 0 end as Run
from @tmp t1) t group by Nm, Run

Eric Z Beard 2008-08-22 01:17:34

Answer 8

+2 A:

My solution just for kicks (this was a fun exercise), no cursors, no iterations, but i do have a helper field

-- Setup test table
DECLARE @names TABLE    (
                        id      INT                 IDENTITY(1,1),
                        name    NVARCHAR(25)        NOT NULL,
                        grp     UNIQUEIDENTIFIER    NULL
                        )

INSERT @names (name)
SELECT 'Harry Johns'    UNION ALL 
SELECT 'Adam Taylor'    UNION ALL
SELECT 'John Smith'     UNION ALL
SELECT 'John Smith'     UNION ALL
SELECT 'Bill Manning'   UNION ALL
SELECT 'Bill Manning'   UNION ALL
SELECT 'Bill Manning'   UNION ALL
SELECT 'John Smith'     UNION ALL
SELECT 'Bill Manning'   

-- Set the first id's group to a newid()
UPDATE      n
SET         grp = newid()
FROM        @names n
WHERE       n.id = (SELECT MIN(id) FROM @names)

-- Set the group to a newid() if the name does not equal the previous
UPDATE      n
SET         grp = newid()
FROM        @names n
INNER JOIN  @names b
        ON  (n.ID - 1) = b.ID
        AND ISNULL(b.Name, '') <> n.Name

-- Set groups that are null to the previous group
-- Keep on doing this until all groups have been set
WHILE (EXISTS(SELECT 1 FROM @names WHERE grp IS NULL))
BEGIN
    UPDATE      n
    SET         grp = b.grp
    FROM        @names n
    INNER JOIN  @names b
            ON  (n.ID - 1) = b.ID
            AND n.grp IS NULL
END

-- Final output
SELECT      MIN(id)     AS id_start,
            MAX(id)     AS id_end,
            name,
            count(1)    AS consecutive
FROM        @names
GROUP BY    grp, 
            name
ORDER BY    id_start

/*
Results:

id_start    id_end  name            consecutive
1           1     Harry Johns     1
2           2     Adam Taylor     1
3           4     John Smith     2
5           7     Bill Manning    3
8           8     John Smith     1
9           9     Bill Manning    1
*/

Jon Erickson 2008-08-22 04:22:17

ansaurus

tags:

views:

answers:

Grouping runs of data

related questions