views:

177

answers:

1

So, I'm having some problems with a MySQL query (see other question), and decided to try a different approach.

I have a database table with some duplicate rows, which I actually might need for future reference, so I don't want to remove. What I'm looking for is a way to display the data without those duplicates, but without removing them. I can't use a simple select query (as described in the other question).

So what I need to do is write a code that does the following: 1. Go through my db Table. 2. Spot duplicates in the "ip" column. 3. Mark the first instance of each duplicate with "0" (in a column named "duplicate") and the rest with "1".

This way I can later SELECT only the rows WHERE duplicate=0.

NOTE: If your solution is related to the SELECT query, please read this other question first - there's a reason I'm not just using GROUP BY / DISTINCT.

Thanks in advance.

+1  A: 

MySQL doesn't have any ranking/analytical/windowing functionality, but you can use a variable instead:

  SELECT t.ip,
         CASE 
           WHEN @ip != t.ip THEN @rank := 0 
           ELSE @rank := @rank + 1 
         END AS duplicate,
         @ip = t.ip
    FROM TABLE t
    JOIN (SELECT @rank := 0, @ip = NULL) r
ORDER BY t.ip

The first occurrence of the ip value will be have the value of zero in the duplicate column; all subsequent records will have value incrementing by one. If you don't want the incrementing number, use:

  SELECT t.ip,
         CASE 
           WHEN @ip != t.ip THEN 0 
           ELSE 1 
         END AS duplicate,
         @ip = t.ip
    FROM TABLE t
    JOIN (SELECT @ip = NULL) r
ORDER BY t.ip

You can get a list of unique IP rows from that by using it in a subquery:

SELECT x.ip
  FROM (paste either query in here) x
 WHERE x.duplicate = 0
OMG Ponies