This is something that comes up so often I almost stopped thinking about it, but I'm almost certain that I'm not doing this the best way.

The question: Suppose you have the following table:

CREATE TABLE TEST_TABLE
(
  ID          INTEGER,
  TEST_VALUE  NUMBER,
  UPDATED     DATE,
  FOREIGN_KEY INTEGER
);

What is the best way to select the TEST_VALUE associated with the most recently updated row where FOREIGN_KEY = 10?

EDIT: Let's make this more interesting, as the answers below simply go with my method of sorting and then selecting the top row. Not bad, but for large result sets the order by would kill performance. So bonus points: how to do it in a scalable manner (i.e. without the unnecessary order by).

A: 

The probably inferior way that I currently go about doing something like this is

SELECT TEST_VALUE
FROM TEST_TABLE
WHERE ID = (
  SELECT ID
  FROM (
    SELECT ID
    FROM TEST_TABLE
    WHERE FOREIGN_KEY = 10
    ORDER BY UPDATED DESC
  )
  WHERE ROWNUM = 1
)

but please StackOverflow Geniuses, teach me some tricks

George Mauer
A: 

Either use a sub-query

WHERE updated = (SELECT MAX(updated) ...)

or select the TOP 1 record with

ORDER BY updated DESC

In Oracle syntax this would be:

SELECT 
  * 
FROM 
(
  SELECT * FROM test_table
  ORDER BY updated DESC
)
WHERE 
  ROWNUM = 1
Tomalak
More than one record with different FOREIGN_KEY s can be updated at the same time though...
George Mauer
I'm not from the Oracle folk, as you may have guessed by my choice of syntax. :-) But the general concept of selecting the TOP 1 record should transcend syntax borders.
Tomalak
Yeah, that seems to be what people are implying, but for seriously large result sets the need to order by would kill performance.
George Mauer
Unless you have an index defined, which is what I would do for a column I wanted sort performance for.
Tomalak
You can't assume there is a TOP 1 (or any equivalent) in an arbitrary version of SQL, so SELECT MAX is generally safer. If that results in large reads, then better indexing is required. A logical index would be a compound index on foreign_key and updated.
le dorfier
A: 
select test_value
from
(
  select test_value 
  from test_table
  where foreign_key=10
  order by updated desc
)
where rownum = 1

Oracle is smart enough to realize it only needs a single row from the inner select and it will do this efficiently.

WW
A: 

wouldn't this work:

SELECT TOP 1 ID
FROM test_table
WHERE FOREIGN_KEY = 10
ORDER BY UPDATED DESC

no need for a subquery...

TJMonk15
No TOP clause in Oracle...
Tomalak
also you would still need a subquery to select test_value
George Mauer
Oh. Wasn't aware of that. My apologies :) Guess I'm too used to MS SQL.
TJMonk15
No TOP clause in early versions of MS SQL either. In MySQL it's "LIMIT n".
le dorfier
+4  A: 

Analytic functions are your friends

SQL> select * from test_table;

        ID TEST_VALUE UPDATED   FOREIGN_KEY
---------- ---------- --------- -----------
         1         10 12-NOV-08          10
         2         20 11-NOV-08          10

SQL> select max( test_value ) keep (dense_rank last order by updated)
  2  from test_table
  3  where foreign_key = 10;

MAX(TEST_VALUE)KEEP(DENSE_RANKLASTORDERBYUPDATED)
-------------------------------------------------
                                               10

You can also extend that to get the information for the entire row

SQL> select max( id ) keep (dense_rank last order by updated) id,
  2         max( test_value ) keep (dense_rank last order by updated) test_value,
  3         max( updated ) keep (dense_rank last order by updated) updated
  4  from test_table
  5  where foreign_key = 10;

        ID TEST_VALUE UPDATED
---------- ---------- ---------
         1         10 12-NOV-08

And analytic approaches are generally pretty darned efficient.

I should also point out that analytic functions are relatively new, so if you are on something earlier than 9.0.1, this may not work. That's not a huge population any more, but there are always a few folks stuck on old versions.
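For comparison, here is a minimal sketch of the same lookup written with a windowed analytic function, ROW_NUMBER(), instead of the KEEP clause. The alias rn is just an illustrative name, and which form performs better will depend on your data and indexes, so treat this as an alternative to try rather than a recommendation.

```sql
-- Number the rows for foreign_key = 10 from newest to oldest,
-- then keep only the newest one.
SELECT test_value
FROM (
  SELECT test_value,
         ROW_NUMBER() OVER (ORDER BY updated DESC) AS rn  -- rn is a hypothetical alias
  FROM test_table
  WHERE foreign_key = 10
)
WHERE rn = 1;
```

If two rows share the same UPDATED value, ROW_NUMBER() picks one of them arbitrarily unless you add another column to the ORDER BY.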

Justin Cave
that is some crazy querying my friend, good job
George Mauer
Wouldn't my rownum query perform better? I agree analytics are a better generic solution.
WW
I'm not sure how analytics really work, but basic CS would tell you that the optimum run-time for this query should be O(n), where n is the number of rows matching the where clause. With an order by it would be O(n log n).
George Mauer
A: 

Firstly, you will always need to look at all the rows with that foreign key, and find the one with the highest UPDATED value...which means a MAX or ORDER BY. The efficiency of the comparison is partly up to the optimizer, so will depend on your Oracle version. Your data structures may have a greater impact on actual performance though. An index on FOREIGN_KEY, UPDATED DESC, TEST_VALUE would probably give the most scalable solution for querying as Oracle will normally be able to give the answer just accessing a single leaf block. There may be a detrimental impact on inserts as new records have to be inserted into that structure.
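As a rough sketch of the index described above (the index name test_table_fk_upd_ix is made up for illustration), it could be created and used like this. Whether the optimizer actually answers the query from the index alone depends on your Oracle version and statistics, so check the execution plan rather than assuming it.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX test_table_fk_upd_ix
  ON test_table (foreign_key, updated DESC, test_value);

-- Every column the query touches is in the index, so Oracle may be able
-- to return the newest row for foreign_key = 10 without visiting the table.
SELECT test_value
FROM (
  SELECT test_value
  FROM test_table
  WHERE foreign_key = 10
  ORDER BY updated DESC
)
WHERE ROWNUM = 1;
```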

Gary
yes but Max is only O(n) whereas order by is higher
George Mauer
A: 

Performance will depend on what is indexed. Here is a method.

WITH 
ten AS
(
    SELECT *
    FROM TEST_TABLE
    WHERE FOREIGN_KEY = 10
)
SELECT TEST_VALUE 
FROM ten
WHERE UPDATED = 
(
    SELECT MAX(UPDATED)
    FROM ten
)
EvilTeach
A: 

There is an Oracle SQL FAQ here that may help you:

http://www.orafaq.com/wiki/SQL_FAQ

Jason Slocomb
A: 
SELECT TEST_VALUE
  FROM TEST_TABLE
 WHERE UPDATED      = ( SELECT MAX(UPDATED)
                          FROM TEST_TABLE
                         WHERE FOREIGN_KEY = 10 )
   AND FOREIGN_KEY  = 10
   AND ROWNUM       = 1  -- Just in case records have the same UPDATED date

Rather than take the first record, you could break a tie with the highest ID or maybe the least/largest TEST_VALUE.
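One way to sketch the highest-ID tie-break is to sort on both columns and take the first row. This assumes ID is unique, which the table definition in the question does not actually declare, and it swaps the MAX subquery for an ordered inline view.

```sql
-- Newest UPDATED first; among rows with the same UPDATED, highest ID first.
-- Assumption: ID is unique, so the first row is fully determined.
SELECT test_value
FROM (
  SELECT test_value
  FROM test_table
  WHERE foreign_key = 10
  ORDER BY updated DESC, id DESC
)
WHERE ROWNUM = 1;
```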

An index on FOREIGN_KEY, UPDATED would help query performance.

Paul Morgan