tags:

views:

63

answers:

3

Hello

I have a table like this

userid  visitorid   time
1       10          2009-12-23
1       18          2009-12-06
1       18          2009-12-14
1       18          2009-12-18
1705    1678        2010-01-24
1705    1699        2010-01-24
1705    1700        2010-01-24
1712    1           2010-01-25
1712    640         2010-01-24
1712    925         2010-01-25
1712    1600        2010-01-24
1712    1630        2010-01-25
1712    1630        2010-01-24
1713    1           2010-01-24
1713    1           2010-01-23

I would like to perform a query such that it removes all the duplicates except for the latest one. I Hope you get an idea?

Example, after the query the table must be like this

userid  visitorid   time
1       10          2009-12-23
1       18          2009-12-18
1705    1678        2010-01-24
1705    1699        2010-01-24
1705    1700        2010-01-24
1712    1           2010-01-25
1712    640         2010-01-24
1712    925         2010-01-25
1712    1600        2010-01-24
1712    1630        2010-01-25
1713    1           2010-01-24
A: 

Assuming your table is called Visitors:

DELETE v1.* FROM Visitors v1
LEFT JOIN (
    SELECT userid, visitorid, MAX(time) AS time
    FROM Visitors v2
    GROUP BY userid, visitorid
) v3 ON v1.userid=v3.userid AND v1.visitorid=v3.visitorid AND v1.time = v3.time
WHERE v3.userid IS NULL;
soulmerge
A: 
DELETE  mo.*
FROM    (
        SELECT  userid, visitorid, MAX(time) AS mtime
        FROM    mytable
        GROUP BY
                userid, visitorid
        ) mi
JOIN    mytable mo
ON      mo.userid = mi.userid
        AND mo.visitorid = mo.visitorid
        AND mo.time < mi.mtime
Quassnoi
thanks but this deletes all rows except only one... i.e for the latest row for each user remains.
atif089
+2  A: 
Delete from YourTable VersionA
  where VersionA.Time NOT IN
    ( select MAX( VersionB.Time ) Time
         from YourTable VersionB
         where VersionA.UserID = VersionB.UserID
           and VersionA.VisitorID = VersionB.VisitorID )

Syntax might need to be adjusted, but SHOULD do the trick. Additionally, you may want to pre-query the Subselect into its own table FIRST, then run the DELETE FROM against that result set.

DRapp
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'VersionA where VersionA.Time NOT IN ( select MAX( VersionB.Time ) Time ' at line 1
atif089
is it possible to get unique rows neglecting the time field then delete all other rows except those and use the max time field ?
atif089
"Currently, you cannot delete from a table and select from the same table in a subquery." http://dev.mysql.com/doc/refman/5.0/en/delete.html - This is because the table is not locked correctly and no-one has implemented the code to lock it correctly yet.
Mark Byers
Since I at not at a MySQL machine, I couldn't test, but the principle is correct... even if based on my subsequent comment to pre-query into its own table, then delete based on join to THAT resultset. As for the Time, it should have been MAX(VersionB.Time) AS Time (or case-sensitive to match your construct)
DRapp
lol dont get furious friend, actually that query went over my head lol so I posted the error directly, tried to fix it but didnt work.
atif089
Sorry, didn't mean for the tone to sound ticked-off, just fact, no offense.
DRapp