ansaurus

Question

Select three rows, two of them (grouped) unique, other filtered by another column (SQL)

Answer 1

+2 A:

This returns duplicate rows if two rows in the same domain_name/index_path group share the maximum gen_timestamp value:

 SELECT x.domain_name, 
        x.index_path, 
        x.collection_name
   FROM YOUR_TABLE x
   JOIN (SELECT t.domain_name,
                t.index_path,
                MAX(t.gen_timestamp) AS max_ts
           FROM YOUR_TABLE t
       GROUP BY t.domain_name, t.index_path) y ON y.domain_name = x.domain_name
                                              AND y.index_path = x.index_path
                                              AND y.max_ts = x.gen_timestamp
ORDER BY x.domain_name, x.index_path

Using ROW_NUMBER (9i+), no risk of duplicates:

WITH summary AS (
  SELECT t.domain_name,
         t.index_path,
         t.collection_name,
         ROW_NUMBER() OVER(PARTITION BY t.domain_name,
                                        t.index_path
                               ORDER BY t.gen_timestamp DESC) AS rank
    FROM YOUR_TABLE t)
  SELECT s.domain_name,
         s.index_path,
         s.collection_name
    FROM summary s
   WHERE s.rank = 1
ORDER BY domain_name, index_path
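
One caveat on the ROW_NUMBER query: it always returns exactly one row per domain/path pair, but if two rows in a group tie on gen_timestamp, which of them gets row number 1 is not guaranteed. A minimal sketch of one way to make the choice deterministic, assuming collection_name is an acceptable tie-breaker (that choice, and the `rn` alias, are illustrative and not part of the original answer):

```sql
-- Sketch only: YOUR_TABLE is the placeholder table name used in the answer.
-- Assumption: when gen_timestamp ties within a group, prefer the
-- alphabetically first collection_name.
WITH summary AS (
  SELECT t.domain_name,
         t.index_path,
         t.collection_name,
         ROW_NUMBER() OVER (PARTITION BY t.domain_name, t.index_path
                                ORDER BY t.gen_timestamp DESC,
                                         t.collection_name) AS rn
    FROM YOUR_TABLE t)
SELECT s.domain_name,
       s.index_path,
       s.collection_name
  FROM summary s
 WHERE s.rn = 1
ORDER BY s.domain_name, s.index_path;
```

Any column (or combination) that is unique within the group works as the second sort key.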

OMG Ponies 2010-09-14 21:01:48

That selects the actual timestamp, whereas I want to select the collection name that the timestamp refers to. Something like what I just edited into the question.

Jacob Nelson 2010-09-14 21:09:38

@jacobnlsn: So you want the `collection_name` value associated with the highest `gen_timestamp` per domain/path pair--correct?

OMG Ponies 2010-09-14 21:12:36

@OMG Ponies: I want the collection_name, domain_name and index_path values associated with the highest gen_timestamp per domain/path pair. So you were very close.

Jacob Nelson 2010-09-14 21:14:49

@jacobnlsn: Understood, updated answer.

OMG Ponies 2010-09-14 21:19:37

@OMG Ponies: You are a hero, the 2nd query works amazingly!

Jacob Nelson 2010-09-14 21:29:02

Answer 2

A:

select distinct domain_name, 
                index_path, 
                first_value(collection_name) over (partition by domain_name, index_path order by gen_timestamp desc) as collection_name
from Your_Table

Allan 2010-09-14 21:13:03

Pretty sure you need PARTITION BY in the analytic, or it'll just be the first collection_name with the highest timestamp value...

OMG Ponies 2010-09-14 21:30:36

@OMG Ponies: You're right, of course.

Allan 2010-09-15 20:42:01

Answer 3

+1 A:

There is an aggregate function available since version 9 that does exactly what you are asking for. Unfortunately I haven't seen this one mentioned in the responses in your two threads yet.

A table to demonstrate your problem:

SQL> create table tablenamehere (domain_name,index_path,collection_name,gen_timestamp)
  2  as
  3  select 'A', 'Z', 'a collection name', systimestamp from dual union all
  4  select 'A', 'Z', 'b collection name', systimestamp - 1 from dual union all
  5  select 'A', 'Y', 'c collection name', systimestamp from dual union all
  6  select 'B', 'X', 'd collection name', systimestamp - 2 from dual union all
  7  select 'B', 'X', 'e collection name', systimestamp - 4 from dual union all
  8  select 'B', 'X', 'f collection name', systimestamp from dual
  9  /

Table created.

And here is your query, which uses MIN(collection_name). For the B/X pair it shows "d collection name", but you want it to show "f collection name":

SQL> SELECT domain_name, index_path, MIN(collection_name) collection_name
  2  FROM TABLENAMEHERE
  3  GROUP BY domain_name, index_path
  4  /

D I COLLECTION_NAME
- - -----------------
A Y c collection name
A Z a collection name
B X d collection name

3 rows selected.

No need to apply analytic functions to all your rows and filter on those results: you are doing an aggregation and the LAST function does your job exactly. Here is a link to the documentation: http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions071.htm#sthref1495

SQL> select domain_name
  2       , index_path
  3       , max(collection_name) keep (dense_rank last order by gen_timestamp) collection_name
  4    from tablenamehere
  5   group by domain_name
  6       , index_path
  7  /

D I COLLECTION_NAME
- - -----------------
A Y c collection name
A Z a collection name
B X f collection name

3 rows selected.
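
To spell out how the pieces of that expression divide the work: the ORDER BY inside KEEP decides which rows of each group count as "last" (those with the highest gen_timestamp), and the outer MAX only matters when several rows tie for that timestamp, in which case it picks the largest collection_name among them. A minimal sketch against the same demo table that also returns the latest timestamp next to the name (the `latest_ts` alias is an illustrative name, not from the original query):

```sql
-- Sketch only: same demo table as above.
-- collection_name comes from the row(s) with the highest gen_timestamp;
-- latest_ts is a plain aggregate over the same group.
select domain_name
     , index_path
     , max(collection_name) keep (dense_rank last order by gen_timestamp) collection_name
     , max(gen_timestamp) latest_ts
  from tablenamehere
 group by domain_name
     , index_path
 order by domain_name
     , index_path;
```

Swapping the outer MAX for MIN changes only how ties on gen_timestamp are broken, not which timestamp wins.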

Regards, Rob.

Rob van Wijk 2010-09-15 08:44:33
