I want to take all the records from my MySQL table and check if there are duplicates. I had the idea of storing them all in an array and then checking the array for duplicates. The problem is, I have about 1.5 million rows in my MySQL table.

This is my code so far:

<?php

$con = mysql_connect('localhost', 'root', '');
$sel = mysql_select_db('usraccts', $con);

$users = array();

$q = "SELECT usrname FROM `users`";
$r = mysql_query($q, $con);

while ($row = mysql_fetch_assoc($r))
{
    // collect every username so the array can be checked for duplicates
    $users[] = $row['usrname'];
}

print_r($users);

?>

I'm not sure how I can adapt this to check for duplicates in the array entries, especially with 1.5 million of them :|

Thanks for any help.

A: 
$q = "SELECT count(*),usrname FROM `users` group by usrname having count(*)>1";
Y. Shoham
A: 

A few comments:

One, you can use the DISTINCT keyword in your SQL to return each value only once (no dupes).

Two, why are you inserting duplicates in the db in the first place? You might want to fix that.

Three, you could select all the rows (not a good idea) and just stick them in the array like you're doing, except make this change:

$users[$row['usrname']] = $row['usrname'];

No dupes in that logic! heh
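
A variation on the keyed-array idea above, as a minimal sketch: instead of overwriting the key, count how often each name shows up and keep only the names seen more than once. The function name findDuplicateNames is hypothetical, and the sample array just stands in for rows fetched from the users table.

```php
<?php

/**
 * Count occurrences per username and return only the names seen more than once.
 * Hypothetical helper; $names can be any iterable, e.g. values fetched row by row.
 */
function findDuplicateNames(iterable $names): array
{
    $counts = [];
    foreach ($names as $name) {
        // Use the name as the array key so each distinct name is stored once.
        $counts[$name] = ($counts[$name] ?? 0) + 1;
    }

    // Keep only the entries that occurred at least twice.
    return array_filter($counts, function ($count) {
        return $count > 1;
    });
}

// Sample data standing in for the usrname column.
$sample = ['matt', 'anna', 'matt', 'joe', 'anna', 'matt'];
print_r(findDuplicateNames($sample));
```

For the sample this gives matt => 3 and anna => 2. Memory grows with the number of distinct names, so with 1.5 million mostly unique rows it is still a large array. PHP array keys are also case-sensitive, so 'Matt' and 'matt' count as different names here.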

Mr-sk
Thanks, I'm actually fixing up a friend's website. He has his users separated by their IDs, so they can have the same username, which is why I'm trying to fix it.
Matt
Cool, good luck Matt!
Mr-sk
A: 

You could use MySQL's GROUP BY clause to find out which usernames exist twice or more. This can put a heavy load on the MySQL server, though.

SELECT usrname, count(*)
FROM `users`
GROUP BY `usrname`
HAVING count(*) > 1;
FlorianH
A: 

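You can do it in MySQL with something like

SELECT usrname, COUNT(usrname) AS duplicates FROM `users` GROUP BY usrname HAVING duplicates > 1

The filter has to go in HAVING rather than WHERE, because WHERE is evaluated before the rows are grouped and counted.

Every usrname returned has at least one duplicate.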

danielrsmith
+1  A: 

$q = "SELECT distinct usrname FROM users";

With this query you get all unique usernames.

hey
+1  A: 

Maybe you could try a SQL query like:

SELECT usrname, 
COUNT(usrname) AS NumOccurrences
FROM users
GROUP BY usrname
HAVING ( COUNT(usrname) > 1 )

This should return all usernames that occur more than once.
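
If it helps, here is a rough sketch of running that query from PHP with PDO rather than the old mysql_* functions. The helper name fetchDuplicateUsernames is made up for illustration, and the connection details in the comment are only assumed from the question's setup.

```php
<?php

/**
 * Run the GROUP BY / HAVING query and return [usrname => occurrences].
 * Hypothetical helper; table and column names are taken from the question.
 */
function fetchDuplicateUsernames(PDO $pdo): array
{
    // Throw exceptions on SQL errors instead of returning false.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $sql = 'SELECT usrname, COUNT(usrname) AS NumOccurrences
            FROM users
            GROUP BY usrname
            HAVING COUNT(usrname) > 1';

    // FETCH_KEY_PAIR maps the first column to the key and the second to the value.
    return $pdo->query($sql)->fetchAll(PDO::FETCH_KEY_PAIR);
}

// Example connection (assumed credentials, matching the question's setup):
// $pdo = new PDO('mysql:host=localhost;dbname=usraccts', 'root', '');
// print_r(fetchDuplicateUsernames($pdo));
```

This way only the duplicated names cross the wire, not all 1.5 million rows.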

Gary Willoughby
A: 

array_unique() will return only the unique array values. In all honesty, I wouldn't delegate this task to PHP, I'd handle it during my query to the database.
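
For what it's worth, array_unique() on its own only gives you the de-duplicated list. A small sketch of how it could be combined with array_diff_key() to see which names were repeated (the sample array is made up and stands in for $users from the question):

```php
<?php

// Sample data standing in for the usrname values loaded into $users.
$users = ['matt', 'anna', 'matt', 'joe', 'anna'];

// array_unique() keeps the first occurrence of each value (original keys preserved).
$unique = array_unique($users);

// array_diff_key() then leaves the entries that array_unique() dropped,
// i.e. the second and later occurrences.
$repeats = array_diff_key($users, $unique);

// Distinct names that occur more than once.
$duplicateNames = array_values(array_unique($repeats));

print_r($duplicateNames);
```

Here $duplicateNames ends up holding 'matt' and 'anna'. It still means loading every row into PHP first, which is the part I'd avoid.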

Jonathan Sampson