tags:

views:

26

answers:

4

I have a table that records a history of address updates, called transaction. The schema is something like row_id (int, PK), user_id (int), address1 (varchar), transdate (timestamp).

I want to query the table and have a single row returned for a user showing what is the latest row (i.e. greatest timestamp), but if there is data in the address1 column I want the latest row with data. If there is no data in this column then just the latest row.

Example:

row_id    user_id    address1        transdate
1         70005      56 The Street   2010-08-25 09:15
2         70005      NULL            2010-08-25 10:04
3         70005      12 Some Road    2010-08-25 11:17
4         70005      NULL            2010-08-25 12:18

With a query like

SELECT  user_id, address1 
FROM    transaction t 
WHERE   user_id = 70005 
        AND row_id = 
            (SELECT MAX(row_id) 
             FROM transaction ti 
             WHERE ti.user_id = t.user_id)

the returned result would be

user_id    address1
70005      NULL

but what I want is

user_id    address1
70005      12 Some Road

because this is the latest row for that user that has some data.

Hope this makes sense. Does anyone have any suggestions?

I am using MySQL 5.1.49 (community). Thanks.

+2  A: 

UPDATED

SELECT  user_id, address1 
FROM    transaction t 
WHERE   user_id = 70005 
    AND row_id = 
        IFNULL(
          (SELECT MAX(row_id) 
         FROM transaction ti 
         WHERE ti.user_id = t.user_id AND address1 IS NOT NULL),

         (SELECT MAX(row_id) 
         FROM transaction ti 
         WHERE ti.user_id = t.user_id ) 
        );
a1ex07
I think you missed the point. The address can be null, but he wants to prioritize not null addresses.
Kendrick
yeah, updated...
a1ex07
Kendrick - yes, exactly. If the address1 col is null tha's fine - return that only if there are no other rows with values in address1. Also, although the query will only ever be run on a single-user basis (i.e. with the condition WHERE user_id = n) I want the same query to return a value for users with all NULLs in address1.
cjashwell
I think this is exactly what I'm after. Just need to test it with a bit of real data ...
cjashwell
@a1ex07 I can't test that, but I'm always amazed at how versatile the IFNULL function is. If that works (and I don't see why it wouldn't) that's a good way of doing it.
Kendrick
Brilliant. Works exactly as I wanted. Thanks very much indeed.
cjashwell
+3  A: 
SELECT    user_id, address1 
FROM      transaction
WHERE     user_id = 70005
ORDER BY  ISNULL(address1) ASC, row_id DESC
LIMIT     1

This should prioritize rows with data, while still working when address1 is NULL. I would also use transdate instead of row_id.

NullUserException
yeah he doesn't need the subquery
vulkanino
I never thought of that, and it's a neat concept. How does the order work if update 1 makes the address "a" and update 2 makes the address "b" Won't it sort the older update first because "a" is before "b"?
Kendrick
@Kendrick, yeah, but the OP's sorting by row_id. I guess these records can never be updated and are always inserted in order.
NullUserException
@NullUserException - correct, no updates, only ordered inserts in this table.
cjashwell
@cja Does my query work?
NullUserException
@NullUserException I got that assumption, my question is what's returned from ISNULL(address1). Wouldn't that return the address if it exists, and null if it doesn't? If it does return the address, then you're sorting on the value of the address first, and ignoring the row_id value because you'll return the highest ordered address regardless of row_id.
Kendrick
@NullUserException - yes, that seems to work perfectly. Thank you very much. I'm just playing with a couple of scenarios involving other data items (as my problem does in real life) comparing your approach with a1ex07 to see which one works best for me.
cjashwell
@Kendrick `ISNULL()` returns 0 or 1; I am sorting by nullity first, then by address
NullUserException
@NullUserException Igmore me, I was confusing IsNull with IfNull...
Kendrick
OK, have tested both solutions with my actual query (which includes a lot more data items and other tables left joined in and various other bits) and the a1ex07 one works best in my particular case. I have learned a lot from both of these answers, however, so thank you very much.
cjashwell
A: 

Take a look at the accepted answer on this question (not my answer, his was better)

By doing a union of the results with address and then the results where there is no address, you can prioritize and still use a simple subquery join.

Note that you're grabbing the max row_id, not that max date, so you're not going to get the results you expect, even if you do this.

Kendrick
Note to my note, if you're positive that larger values of row_id always match larger values of transdate, then you can use row_id, but you're making an assumption that I wouldn't be comfortable making...
Kendrick
I am sure the latest row_id is what I'm after. Because the transdate in this table is only granular to the level of seconds there are rows for the same user with the same transdate.Thank you for the link, though.
cjashwell
A: 
SELECT  user_id, address1
FROM    transaction t  
WHERE   user_id = 70005 
ORDER BY case when address1 is not null then 0 else 1 end, transdate desc
limit 1
RedFilter