tags:

views:

22

answers:

3

I've had a good look around but havnt been able to find a solution so hoping someone can help with this one.

I have a MySQL table of results from an internal logging application which records a result from a check routine, there are a number of check routines which are identified with the tracker column:

id (int)(PK), tracker (int), time (timestamp), result (int)

A result only needs to be recorded if the previous result is not the same, only changes need to be captured. Unfortunatly this was ignored when it was built (in a hurry) a month ago and results have been recorded blindly with no checks on previous results. This has now been recorded but I'm still left with a few thousand rows of which a significant number are duplicate entries and I'm after a way of clearing these out to just leave the change points.

So I need to go through each row, look at the previous result recorded by that tracker and delete the row if its the same, this is a bit beyond my experience with MySQL and the attempts I've made so far have been fairly poor!

Can anyone help?

A: 

There are complaints that this is slow to execute, but that probably doesn't affect you. It will certainly be faster than anything else you might do:

select DISTINCT id, tracker, time, result
from table;
wallyk
I need to record any and all changes to the results so this wouldnt work unfortunatly.
Duncan
+1  A: 

Use:

   DELETE a
     FROM YOUR_TABLE a
LEFT JOIN (SELECT MAX(t.id) AS latest_id
             FROM YOUR_TABLE t
         GROUP BY t.tracker, t.result) b ON b.latest_id = a.id
    WHERE b.latest_id IS NULL

Alternate using IN:

DELETE FROM YOUR_TABLE
 WHERE id NOT IN (SELECT x.latest_id
                   FROM (SELECT MAX(t.id) AS latest_id
                          FROM YOUR_TABLE t
                      GROUP BY t.tracker, t.result) x )
OMG Ponies
Top one did what I was after, reduced number of rows by about 90% which should certainly make queries and backups quicker. Many thanks!
Duncan
why do you need the wrapper subquery x in the second suggested query?
MattSmith
@MattSmith: Without the wrapper, you'll get a MySQL #1093 error about referencing the table that is mutating.
OMG Ponies
A: 

I think you want a unique index on the table:

ALTER IGNORE TABLE table ADD UNIQUE INDEX (tracker, time, result)

http://dev.mysql.com/doc/refman/5.1/en/alter-table.html

You'll have to use INSERT IGNORE... when adding new rows as inserts that would duplicate an existing (tracker, time, result) key will cause an error.

Joshua Martell
It's not explicitly stated, but the OP does read as though that's intended... but you can't apply the constraint until the data satisfies it. And the OP states they want to DELETE the duplicates...
OMG Ponies
I had thoughts along these lines for future recording but a result captured a few minutes after another will still produce a new row even if the result is the same as the time is different surely?
Duncan