I have a query:

select  event_type, 
    department_name,
    effective_time,
    row_number() OVER (PARTITION BY event_type,department_name ORDER BY effective_time) row
from    a
order by effective_time

It returns a row set:

event_type  department_name effective_time
3   A 02/10/09 13:12:00
3   B 02/10/09 15:44:00
3   B 02/10/09 20:36:00
7   C 04/01/09 00:01:00
7   D 04/10/09 00:01:00
7   D 04/20/09 00:01:00
7   E 04/20/09 00:01:00
7   F 04/23/09 09:32:00
7   F 05/15/09 12:21:00
7   G 05/15/09 12:21:00
7   H 05/15/09 12:21:00
1   H 07/28/09 08:51:00
1   G 07/28/09 08:51:00
1   F 07/28/09 10:40:00
1   F 07/28/09 12:34:00
1   H 07/28/09 12:34:00
1   G 07/28/09 12:34:00
1   D 07/29/09 10:45:00
1   D 07/29/09 12:48:00
1   G 07/31/09 13:47:00
1   F 07/31/09 13:47:00
1   D 08/03/09 00:01:00
3   B 08/03/09 10:39:00

I need the row set to look like:

event_type  department_name effective_time
3   A 02/10/09 13:12:00
3   B 02/10/09 15:44:00
7   C 04/01/09 00:01:00
7   D 04/10/09 00:01:00
7   E 04/20/09 00:01:00
7   F 04/23/09 09:32:00
7   G 05/15/09 12:21:00
7   H 05/15/09 12:21:00
1   H 07/28/09 08:51:00
1   G 07/28/09 08:51:00
1   F 07/28/09 10:40:00
1   H 07/28/09 12:34:00
1   G 07/28/09 12:34:00
1   D 07/29/09 10:45:00
1   G 07/31/09 13:47:00
1   F 07/31/09 13:47:00
1   D 08/03/09 00:01:00
3   B 08/03/09 10:39:00

Essentially, I want to remove the second (and any later) consecutive occurrence of a given event_type and department_name combination, keeping only the first row of each run.

I was hoping to use row_number() to solve this problem by eliminating all rows where row > 1.

Unfortunately, as written, row_number() does not reset the counter when event_type and department_name change from one row to the next; it keeps counting across every row that shares the same pair of values.
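To illustrate the behaviour described above, here is a minimal sketch that runs the same row_number() expression over only the three (3, B) rows from the result set. The inline VALUES table is a hypothetical stand-in for table a and assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample: the three (3, B) rows from the result set above.
-- The VALUES derived table stands in for table a (SQL Server 2008+ assumed).
select  event_type,
    department_name,
    effective_time,
    row_number() OVER (PARTITION BY event_type,department_name ORDER BY effective_time) row
from    (values (3, 'B', cast('2009-02-10T15:44:00' as datetime)),
                (3, 'B', cast('2009-02-10T20:36:00' as datetime)),
                (3, 'B', cast('2009-08-03T10:39:00' as datetime))
        ) a (event_type, department_name, effective_time)
order by effective_time
-- row comes back as 1, 2, 3: all three rows fall into one partition.
```

In the full data set the 08/03/09 row sits months after the other two, with many other rows in between, yet it still gets row 3 rather than 1, because PARTITION BY looks only at the column values and not at adjacency.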

Questions:

  1. Can the row_number() calculation be adjusted?
  2. Is there another approach that would work more efficiently?
  3. Can this be done without programmatic intervention (i.e. stored procedure or UDF)?

Thanks for your assistance.

A: 

You can group by event_type and department_name. The only catch is the date: you will have to wrap it in an aggregate such as min(date), or the group by does not make sense.

select  event_type, 
    department_name,
    convert(varchar, effective_time, 1) as date
from    a
group by event_type, department_name, convert(varchar, effective_time, 1)
-- effective_time is not in the group by, so it must be aggregated to sort on it
order by min(effective_time)
CSharpAtl
Doesn't work--this gives me a *unique* list of event_type/department_name values. If you look closely at the desired results, the event_type/department_name combination can occur multiple times, but with different times.
Craig
What are you wanting then?
CSharpAtl
what do the times mean then?
CSharpAtl
looks like you want to group on day too....?
CSharpAtl
added a group by DATE part of the Datetime column as well
CSharpAtl
This should give you 1 row for the day...
CSharpAtl
Unfortunately, I need the effective_time to be accurate to the seconds.
Craig
But which time do you want? If there are multiple times in a day, you only want one datetime, but if the same event/department combination happens at a later datetime, you want that in the result set too?
CSharpAtl
+2  A: 

Try this:

SELECT event_type, 
    department_name,
    MIN(effective_time)
FROM    a
GROUP BY event_type, 
    department_name
ORDER BY MIN(effective_time)
Chris McCall
+1 Dang, why didn't I think of that :)
Andomar
@Chris: I think it should be Min instead of Max.
shahkalpesh
@shahkalpesh: edited
Chris McCall
No. See my response to CSharpAtl for the reason why.
Craig
+2  A: 

Right, after reading your comment I think I understand. One approach would be to number the rows in a subquery based on effective_time. With the numbers it's easy to look up the previous row. Then you can filter out "repeat" rows by requiring that each row differ from its predecessor.

Here's an example query:

;with numbered as (
    SELECT event_type, department_name, effective_time,
           row_number() OVER (ORDER BY effective_time) row
    FROM a
)
SELECT    cur.event_type, cur.department_name, cur.effective_time
FROM      numbered cur
LEFT JOIN numbered prev ON cur.row = prev.row + 1
WHERE     cur.row = 1
          or prev.event_type <> cur.event_type
          or prev.department_name <> cur.department_name
ORDER BY  cur.effective_time

By the way, if you like tested answers, post the example data as text and not as a jpg image :)
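As a possible variation on the same "compare with the predecessor" idea, the self-join can be replaced by LAG(). This is only a sketch under two assumptions that are not stated in the question: the server is SQL Server 2012 or later (where LAG() exists), and event_type is never NULL, so a NULL previous value can only mean "first row". The names flagged, prev_event_type and prev_department_name are made up for the example.

```sql
-- Assumes SQL Server 2012 or later, where LAG() is available.
;with flagged as (
    SELECT event_type, department_name, effective_time,
           -- values from the row immediately before this one, by effective_time
           LAG(event_type)      OVER (ORDER BY effective_time) prev_event_type,
           LAG(department_name) OVER (ORDER BY effective_time) prev_department_name
    FROM a
)
SELECT   event_type, department_name, effective_time
FROM     flagged
WHERE    prev_event_type is null                 -- first row has no predecessor
         or prev_event_type <> event_type
         or prev_department_name <> department_name
ORDER BY effective_time
```

Like the row_number() version, this relies on ORDER BY effective_time alone, so rows that share the same effective_time have no guaranteed order relative to each other.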

Andomar
this is true, but i still have the issue with the row_number calculation.
Craig
@Craig: Ok, I think I understand the question better now, edited.
Andomar
*very* close. unfortunately, the query loses the first row (3,A,02/10/09 13:12:00).
Craig
@Craig: that's an easy fix, just add an OR clause for "prev.event_type is null" or "cur.row = 1"
Andomar
That added the first row, thanks. I noticed, however, that (1,G,07/31/09 13:47:00) is missing from the result set. This omission was present before the 'first row' fix. Thoughts? And thanks!
Craig
@Craig: You're welcome, interesting question! No clue on the missing row; neither way of adding the first row should affect other rows
Andomar
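One possible explanation for the missing row, not confirmed anywhere in this thread: several rows share the same effective_time (for example the G and F rows at 07/31/09 13:47:00), so ORDER BY effective_time alone does not define a unique order. If the server evaluates the numbered CTE once for cur and once for prev, the two passes may number the tied rows differently, and a row can end up being compared with itself and filtered out. A sketch that adds tie-breaker columns to make the numbering deterministic follows; the choice of event_type and department_name as tie-breakers is an assumption, and a unique key would be a better choice if the table has one.

```sql
;with numbered as (
    SELECT event_type, department_name, effective_time,
           -- extra ORDER BY columns break ties between equal effective_time values
           row_number() OVER (ORDER BY effective_time, event_type, department_name) row
    FROM a
)
SELECT    cur.event_type, cur.department_name, cur.effective_time
FROM      numbered cur
LEFT JOIN numbered prev ON cur.row = prev.row + 1
WHERE     cur.row = 1
          or prev.event_type <> cur.event_type
          or prev.department_name <> cur.department_name
ORDER BY  cur.effective_time
```

Note that the tie-breaker also decides which of several simultaneous rows counts as the "previous" one, so it is worth checking the output against the expected rows.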
A: 
select  event_type, 
    department_name,
    min(effective_time) as effTime
from    a
group by event_type, department_name
order by effTime

Does this help at all?

shahkalpesh
No. See my response to CSharpAtl for the reason why.
Craig