Hi,

I am trying to construct a somewhat tricky MySQL SELECT statement. Here is what I want to accomplish:

I have a table like this:

data_table

uniqueID   stringID    subject
 1          144        "My Subject"
 2          144        "My Subject - New"
 3          144        "My Subject - Newest"
 4          211        "Some other column"

Basically, what I'd like to do is SELECT/GROUP BY the stringID (picture the stringID as a thread ID) so that it is not duplicated. Furthermore, I'd like to SELECT the most recent row for each stringID (which, for stringID 144 in the example above, is uniqueID 3).

Therefore, if I were to query the database, it would return the following (with the most recent uniqueID at the top):

uniqueID   stringID    subject
 4          211        "Some other column"  
 3          144        "My Subject - Newest" //Notice this is the most recent and distinct stringID row, with the proper subject column.

I hope this makes sense. Thank you for your help.

+4  A: 

Try the following. It might not be the most efficient query, but it will work:

SELECT uniqueID, stringID, subject
FROM data_table
WHERE uniqueID IN
 (
  SELECT MAX(uniqueID) 
  FROM data_table
  GROUP BY stringID
 )
ORDER BY uniqueID DESC
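
As a quick sanity check, the query above can be run against the sample rows from the question. The sketch below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made only so the example is self-contained); the column types are guesses, since the question does not give the schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")

# Column types are assumed; the question only shows column names and sample values.
conn.execute(
    "CREATE TABLE data_table ("
    " uniqueID INTEGER PRIMARY KEY,"
    " stringID INTEGER,"
    " subject TEXT)"
)
conn.executemany(
    "INSERT INTO data_table (uniqueID, stringID, subject) VALUES (?, ?, ?)",
    [
        (1, 144, "My Subject"),
        (2, 144, "My Subject - New"),
        (3, 144, "My Subject - Newest"),
        (4, 211, "Some other column"),
    ],
)

# The inner query finds the highest uniqueID per stringID;
# the outer query fetches the full rows for those IDs, newest first.
latest_rows = conn.execute(
    """
    SELECT uniqueID, stringID, subject
    FROM data_table
    WHERE uniqueID IN (
        SELECT MAX(uniqueID)
        FROM data_table
        GROUP BY stringID
    )
    ORDER BY uniqueID DESC
    """
).fetchall()

for row in latest_rows:
    print(row)
```

With the sample data this prints the rows for uniqueID 4 and uniqueID 3, in that order, which matches the result the question asks for.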
Andrew Moore
This query helped the most. Also, following lexu's suggestion above, I replaced 'uniqueID' with the timestamp column. Thanks very much for your help.
A: 

Edit: Based on new info provided by the OP in a comment, this would be preferable to relying on uniqueID:

select t.uniqueID
       , t.stringID
       , t.subject
       , t.your_timestamp_col
from   data_table t
       left outer join data_table t2
       on t.stringID = t2.stringID
       and t2.your_timestamp_col > t.your_timestamp_col
where  t2.uniqueID is null

If, as lexu mentions in a comment, you are certain that the highest uniqueID value always corresponds with the newest subject, you could do this:

select t.uniqueID
       , t.stringID
       , t.subject
from   data_table t
       left outer join data_table t2
       on t.stringID = t2.stringID
       and t2.uniqueID > t.uniqueID
where  t2.uniqueID is null

This basically means: return only those records from data_table for which no other row with the same stringID has a higher uniqueID value.
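
To see the self-join at work, here is a minimal sketch that runs the uniqueID version against the question's sample rows. It uses Python's built-in sqlite3 module with an in-memory database in place of MySQL (an assumption for the sake of a self-contained example), the column types are guessed, and the ORDER BY is added only to get the newest-first order the question asks for.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")

# Column types are assumed; the question does not show the real schema.
conn.execute(
    "CREATE TABLE data_table ("
    " uniqueID INTEGER PRIMARY KEY,"
    " stringID INTEGER,"
    " subject TEXT)"
)
conn.executemany(
    "INSERT INTO data_table (uniqueID, stringID, subject) VALUES (?, ?, ?)",
    [
        (1, 144, "My Subject"),
        (2, 144, "My Subject - New"),
        (3, 144, "My Subject - Newest"),
        (4, 211, "Some other column"),
    ],
)

# Each row t is paired with every row t2 in the same thread that has a
# higher uniqueID. Rows with no such partner come back with t2.* as NULL,
# and those are exactly the newest rows per stringID.
newest_per_thread = conn.execute(
    """
    SELECT t.uniqueID, t.stringID, t.subject
    FROM data_table t
    LEFT OUTER JOIN data_table t2
        ON t.stringID = t2.stringID
        AND t2.uniqueID > t.uniqueID
    WHERE t2.uniqueID IS NULL
    ORDER BY t.uniqueID DESC
    """
).fetchall()

for row in newest_per_thread:
    print(row)
```

Rows 1 and 2 each find a newer row in thread 144 and are filtered out; rows 3 and 4 find none and are returned.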

Adam Bernier
It will actually perform worse. The subquery does not use any of the outer query's columns and is therefore computed only once. A `max` is much quicker than comparing each ID one by one. Moreover, the join then has to apply the `where` clause. The subquery, however, creates a hash table that serves as a lookup for each of the IDs. Ergo, there is only one comparison, and we don't have to check the column after all the comparisons are done.
Eric