ansaurus

Question

Advanced grouping without using a sub query

Answer 1

A:

Options available in general include:

Store the illustrated data in a temp table, then query the temp table.
Use a WITH clause to define the complex query, then have the DBMS sort out the query.

The WITH clause effectively lets you give a name to a sub-query; the optimizer will avoid re-evaluating it if at all possible. The temp table solution is likely to be the simplest: do a GROUP BY on ID with MIN(rank) against the temp table, then join that result back to the temp table.
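As a rough sketch of the temp table route, assuming SQL Server syntax and the column names used elsewhere in this thread (id, rank, type, status, amount). Here my_table stands in for the expensive source, and #results and #lowest are made-up temp table names; quote rank if your DBMS treats it as a reserved word.

```sql
-- Step 1: read the expensive source exactly once and keep its rows.
SELECT id, rank, type, status, amount
INTO   #results
FROM   my_table;

-- Step 2: work out the lowest rank for each id from the cached rows.
SELECT id, MIN(rank) AS min_rank
INTO   #lowest
FROM   #results
GROUP  BY id;

-- Step 3: join back to pick up the full row for each (id, lowest rank).
SELECT r.id, r.rank, r.type, r.status, r.amount
FROM   #results AS r
       INNER JOIN #lowest AS m
       ON  r.id   = m.id
       AND r.rank = m.min_rank;

-- Clean up the temp tables when done.
DROP TABLE #lowest;
DROP TABLE #results;
```

If two rows share the lowest rank for an id, both come back. The expensive source is still read only once; every later step touches the temp tables only.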

Jonathan Leffler 2009-06-09 05:55:22

Answer 2

+1 A:

-- keep a t1 row only when no row with the same id has a lower rank
select t1.id
     , t1.rank
     , t1.type
     , t1.status
     , t1.amount
from   my_table as t1
       left outer join my_table as t2
       on  t1.id = t2.id
       and t2.rank < t1.rank
where  t2.id is null

Adam Bernier 2009-06-09 05:57:02

In this case what is in t2?

vdh_ant 2009-06-09 06:15:41

@Anthony: t2 is a second alias for the same table; joining a table to itself like this is known as a self-join. It works because the left join looks, for each t1 row, for any row with the same id and a lower rank. Where no such row exists, the t2 columns come back NULL, so the WHERE clause excludes everything except the lowest-ranked row for each id.

Adam Bernier 2009-06-09 06:28:06

@adam: The problem is that getting the data out of my_table is very expensive (between 2 and 6 seconds), so I would like to avoid joining onto the table again...

vdh_ant 2009-06-09 06:52:02

What I have done (even though I really didn't want to) is put the results into a temp table and then join the temp table onto itself...

vdh_ant 2009-06-09 07:00:17

Well, that may be the way to go. Have a look at what another SO user has to say about decomposing your complicated queries into steps: http://stackoverflow.com/questions/754527/best-way-to-test-sql-queries/754570#754570

Adam Bernier 2009-06-09 07:04:00

Have you tried Alex's solution yet? http://stackoverflow.com/questions/968305/advanced-grouping-without-using-a-sub-query/968399#968399

Adam Bernier 2009-06-09 07:04:57

Yes, but the performance difference was very small. Since it simplifies the query, though, I might use it.

vdh_ant 2009-06-10 01:53:00

Answer 3

+1 A:

SELECT * FROM TheTable
WHERE 1 = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Rank DESC)

Alex Martelli 2009-06-09 06:00:18

This was my first instinct. Maybe the OP will post some timing results.

Adam Bernier 2009-06-09 06:02:05

Just for the record, in my situation this causes an error: "Windowed functions can only appear in the SELECT or ORDER BY clauses." So I have had to put the OVER part in a sub query and the WHERE part in the outer query.
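A minimal sketch of that rewrite, assuming the column names from the self-join answer (id, rank, type, status, amount). The derived-table alias numbered and the column alias rn are names picked here for illustration only.

```sql
-- The window function is computed in the inner SELECT, where it is allowed;
-- the outer query then filters on its result.
SELECT id, rank, type, status, amount
FROM  (SELECT id, rank, type, status, amount,
              ROW_NUMBER() OVER (PARTITION BY id ORDER BY rank DESC) AS rn
       FROM   TheTable) AS numbered
WHERE  rn = 1;
```

ORDER BY rank DESC keeps the highest rank per id, as in the query above; switch to ASC to get the lowest rank, which is what the other answers return.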

vdh_ant 2009-06-10 01:56:40

Answer 4

+1 A:

This will work:

with temp as (
select *, row_number() over (partition by id order by rank) as rownum
from table_name
)
select * from temp where rownum = 1

This gives one record per id: the row with the lowest rank.

Rashmi Pandit 2009-06-09 09:06:01

Answer 5

A:

Why is getting the data set so expensive? I see nothing terribly complex here. Do you have the indexes you need, and is the query using them? Are the statistics out of date?

HLGEM 2009-06-09 20:25:37

For the purposes of the question I have simplified the scenario. Basically the table is a table-valued function which unions the results from 2 other table-valued functions, each of which uses about 6 temp tables to build up its results. This is due to the level of normalization present in the database and how much data needs to be derived to build up a picture of the data. Really this data should be captured in a materialized view or something similar, but I can't make any change like this in this release cycle. Cheers

vdh_ant 2009-06-10 01:35:17
