ansaurus

Question

sql max/min query and data transformation

Answer 1

+3 A:

I think the only way you'll be able to do it is with sub-queries.

SELECT shipment_id
    , (SELECT TOP 1 time 
     FROM test AS [b] 
     WHERE b.shipment_id = a.shipment_id 
     AND b.stop_seq = MIN(a.stop_seq)) AS [start]
    , (SELECT TOP 1 time 
     FROM test AS [b] 
     WHERE b.shipment_id = a.shipment_id 
     AND b.stop_seq = MAX(a.stop_seq)) AS [end]
FROM test AS [a]
GROUP BY shipment_id

You'll need to use the DATEPART function to chop up the time column to get your exact output.

CptSkippy 2009-08-10 20:21:24

the times may not always be in sequential order...i forgot to show that in my example...updated my question to reflect this.

thomas 2009-08-10 20:22:54

this works. thanks. i knew it could be done with multiple selects but the purist in me was looking for alternatives.

thomas 2009-08-10 20:43:48

@thomas : If you create a clustered index on shipment_id and stop_seq the impact of doing the sub-queries will be negligible because all of the clauses are sargable and will perform index seeks.

CptSkippy 2009-08-10 21:19:49

@thomas: If you have a shipments table, see the addendum to my solution for a more efficient alternative to CptSkippy's solution, but along the same lines.

Steve Kass 2009-08-10 23:25:39

@CptSkippy: Not everyone might realize that your TOP 1 is there to avoid getting an error if the subqueries return more than one row. Personally, I'd leave TOP 1 out for two reasons. First, TOP 1 without ORDER BY is a bad pattern to use in general, because it's nondeterministic, and second, because if the subquery returns more than one row, chances are that there is a flaw in design or logic somewhere that is only being hidden.Just my 2c worth. Your solution is a good one, though, and probably as efficient as any!

Steve Kass 2009-08-10 23:31:27

@Steve: All valid points and I agree with you to an extent. I think it really boils down to the amount of control you have over the systems you're working with.

CptSkippy 2009-08-11 01:25:58

this one is going to meet my specific need. thanks!

thomas 2009-08-11 02:09:01

Answer 2

A:

Am I correct in thinking that you want the first time rather than the 'min' time, and the last time in the sequence rather than the 'max' time?

fras85 2009-08-10 20:38:46

that is correct. the time associated with the first stop (i.e. min(stop_seq)) and the time associated with the last stop (i.e. max(stop_seq))

thomas 2009-08-10 20:40:52

Answer 3

A:

SELECT C.shipment_id, C.start, B2.time AS stop FROM
(    
   SELECT A.shipment_id, B1.time AS start, A.max_stop_seq FROM
   (
      SELECT shipment_id, MIN(stop_seq) as min_stop_seq, MAX(stop_seq) as max_stop_seq 
      FROM test
      GROUP BY shipment_id
   ) AS A

   INNER JOIN 

   (
      SELECT shipment_id, stop_seq, time FROM test
   ) AS B1

   ON A.shipment_id = B1.shipment_id AND A.min_stop_seq = B1.stop_seq
) AS C

INNER JOIN

(
   SELECT shipment_id, stop_seq, time FROM test
) AS B2

ON C.shipment_id = B2.shipment_id AND C.max_stop_seq = B2.stop_seq

Richard Dunlap 2009-08-10 20:49:25

Answer 4

+1 A:

Use a Common Table Expression (CTE) - this works (at least on my SQL Server 2008 test system):

WITH SeqMinMax(SeqID, MinID, MaxID) AS
(
    SELECT Shipment_ID, MIN(stop_seq), MAX(stop_seq)
    FROM test
    GROUP BY Shipment_ID
)
SELECT 
    SeqID 'Shipment_ID',
    (SELECT TIME FROM test 
       WHERE shipment_id = smm.seqid AND stop_seq = smm.minid) 'Start',
    (SELECT TIME FROM test 
       WHERE shipment_id = smm.seqid AND stop_seq = smm.maxid) 'End'
FROM seqminmax smm

The SeqMinMax CTE selects the min and max "stop_seq" values for each "shipment_id", and the rest of the query then builds on those values to retrieve the associated times from the table "test".

CTE's are supported on SQL Server 2005 (and are a SQL:2003 standard feature - no Microsoft "invention", really).

Marc

marc_s 2009-08-10 20:53:49

Answer 5

A:

select t1.shipment_id, t1.time start, t2.time [end]
from (
    select shipment_id, min(stop_seq) min, max(stop_seq) max
    from test
    group by shipment_id
) a
inner join test t1 on a.shipment_id = t1.shipment_id and a.min = t1.stop_seq 
inner join test t2 on a.shipment_id = t2.shipment_id and a.max = t2.stop_seq

RedFilter 2009-08-10 21:29:57

Answer 6

A:

I suggest you take advantage of row_number and PIVOT. This may look messy, but I think it will perform well, and it's more adaptable to various assumptions. For example, it doesn't assume that the latest datetime value corresponds to the largest stop_seq value for a given shipment.

with test_ranked(shipment_id,stop_seq,time,rankup,rankdown) as (
  select
    shipment_id, stop_seq, time,
    row_number() over (
      partition by shipment_id
      order by stop_seq
    ),
    row_number() over (
      partition by shipment_id
      order by stop_seq desc
    )
  from test
), test_extreme_times(shipment_id,tag,time) as (
  select
    shipment_id, 'start', time
  from test_ranked where rankup = 1
  union all
  select
    shipment_id, 'end', time
  from test_ranked where rankdown = 1
)
  select
    shipment_id, [start], [end]
  from test_extreme_times
  pivot (max(time) for tag in ([start],[end])) P
  order by shipment_id;
  go

The PIVOT isn't really needed, but it's handy. However, do note that the MAX inside the PIVOT expression doesn't do anything useful. There's only one [time] value for each tag, so MIN would work just as well. The syntax requires an aggregate function in this position.

Addendum: Here's an adaptation of CptSkippy's solution that may be more efficient than using MIN and MAX if you have a shipments table:

SELECT shipment_id
    , (SELECT TOP 1 time 
        FROM test AS [b] 
        WHERE b.shipment_id = a.shipment_id 
        ORDER BY stop_seq ASC) AS [start]
    , (SELECT TOP 1 time 
        FROM test AS [b] 
        WHERE b.shipment_id = a.shipment_id 
        ORDER BY stop_seq DESC) AS [end]
FROM shipments_table AS [a];

Steve Kass 2009-08-10 23:20:48

thanks steve. this looks a little more complicated than I am looking for given the specific need and the fact that I already have a bunch of CTEs in my stored procedure and don't want to clutter it up entirely :-) I do like the row_number() with partition...i need to research that some more to understand its capabilities better. thanks for you help and suggestions!

thomas 2009-08-11 02:10:36

ansaurus

tags:

views:

answers:

sql max/min query and data transformation

related questions