views:

516

answers:

6

I have the following table and data in SQL Server 2005:

create table LogEntries (
  ID int identity,
  LogEntry varchar(100)
)

insert into LogEntries values ('beans')
insert into LogEntries values ('beans')
insert into LogEntries values ('beans')
insert into LogEntries values ('cabbage')
insert into LogEntries values ('cabbage')
insert into LogEntries values ('beans')
insert into LogEntries values ('beans')

I would like to group repeated LogEntries so that I have the following results:

LogEntry  EntryCount
beans     3
cabbage   2
beans     2

Can you think of any way to do this in TSQL outside of using a cursor?

A: 

SQL not exactly my strong point but won't

SELECT LogEntry, COUNT(1) AS Counter FROM LogEntries GROUP BY LogEntry

do it?

Paul
Nope this would result inbeans - 5Cabbage - 2
Dheer
Ah, fell it to the bear pit too. Oh well. At least I'm in good company :)
Paul
A: 

I don't think that you can do this with one query. In order to provide counts in a query, you need to group using the LogEntry column. However, this will just give you total counts per LogEntry, and not the counts of the numbers of entries in sequence that you are looking for. I think that a cursor is called for (or bring the whole dataset to your application, and use logic there to get the results that you want).

Yaakov Ellis
A: 

unless my brain has yet to boot up this morning

SELECT LogEntry, COUNT(LogEntry) as EntryCount FROM LogEntries GROUP BY LogEntry

kpollock
ah b*gger, fell into the same trap as Russ Cam...
kpollock
Nope this would result inbeans - 5Cabbage - 2
Dheer
worse, someone picked me up on it before I finished typing my retraction!!! :-)
kpollock
A: 

Now I have looked at the eactual question closely enough :-)

Hmm, on reconsidering, why not just use a cursor? The performance isn't always worse than straight SQL - and it'd certianly be easy for other people to follow the code when they come to look at it. Wrap it in a stored proc or function and you'd be able to use it pretty much anywhere you might need.

kpollock
+2  A: 

This is a set-based solution for the problem. The performance will probably suck, but it works :)

CREATE TABLE #LogEntries (
  ID INT IDENTITY,
  LogEntry VARCHAR(100)
)

INSERT INTO #LogEntries VALUES ('beans')
INSERT INTO #LogEntries VALUES ('beans')
INSERT INTO #LogEntries VALUES ('beans')
INSERT INTO #LogEntries VALUES ('cabbage')
INSERT INTO #LogEntries VALUES ('cabbage')
INSERT INTO #LogEntries VALUES ('carrots')
INSERT INTO #LogEntries VALUES ('beans')
INSERT INTO #LogEntries VALUES ('beans')
INSERT INTO #LogEntries VALUES ('carrots')

SELECT logentry, COUNT(*) FROM (
    SELECT logentry, 
    ISNULL((SELECT MAX(id) FROM #logentries l2 WHERE l1.logentry<>l2.logentry AND l2.id < l1.id), 0) AS id
    FROM #LogEntries l1
) AS a
GROUP BY logentry, id


DROP TABLE #logentries

Results:

beans   3
cabbage 2
carrots 1
beans   2
carrots 1

The ISNULL() is required for the first set of beans.

Jonas Lincoln
Brilliant! Thanks, this is a lot simpler than the cursor I wrote.
Lance Fisher
I am curious, how is the performance vs a cursor?
kpollock
I am going senile too, as now I look at it, I have used this sort of set based solution myself in the past.
kpollock
+3  A: 

I think this will do it... didn't check too thoroughly though

select 
    COUNT(*),subq.LogEntry 
from 
(
    select 
     ROW_NUMBER() OVER(ORDER BY id)-ROW_NUMBER() OVER(PARTITION BY logentry ORDER BY id) as t,*
    from 
     LogEntries
) subq 
group by 
    subq.t,subq.LogEntry 
order by 
    MIN(subq.ID)
spender
Thanks, it does work. I haven't used the partition keyword before. That's interesting.
Lance Fisher
This is a very elegant and cunning use ov the OVER() function. Nice!
Jonas Lincoln
NEAT! I don't know partitionng well enough... should do!
kpollock