I have a users table where user.username has many duplicates that differ only in letter case, such as:

username and Username and useRnAme
john and John and jOhn

This was caused by a bug; each set of three records should have been a single record.

I'm trying to come up with a SQL query that lists all of these cases ordered by their creation date, so ideally the result should be something like this:

username jan01
useRnAme jan02
Username jan03
john     feb01 
John     feb02
jOhn     feb03

Any suggestions would be much appreciated.

A: 

Use ToLower() or equivalent function in your SELECT, and order by that column.

David Lively
That will include usernames that do not suffer from the multi-entry problem.
Larry Lustig
+7  A: 

Leaving aside the issue of case sensitivity for a moment, the basic strategy is:

 SELECT username, create_date FROM your_table
     WHERE username IN 
     (SELECT username FROM your_table GROUP BY username HAVING COUNT(*) > 1)
 ORDER BY username, create_date

Many RDBMSes (including MySQL, assuming you are using CHAR or VARCHAR for the username column) perform case-insensitive comparisons by default. For those databases, the above solution will work as is. To solve the case-sensitivity issue in other products, wrap every occurrence of username except the first in the uppercase conversion function specific to your RDBMS:

 SELECT username, create_date FROM your_table
     WHERE UPPER(username) IN 
     (SELECT UPPER(username) FROM your_table GROUP BY UPPER(username) HAVING COUNT(*) > 1)
 ORDER BY username, create_date
Larry Lustig
If it's for MySQL, the UPPER is not needed and might even make the query unnecessarily slow.
Mark Byers
Yes, that's true (and true for various other RDBMSes as well). I'll modify the answer to reflect that.
Larry Lustig
OK +1 for the update.
Mark Byers
is there a way to make sure the dates are in ascending order for each group of dups?
hdx
@hdx: This answer already does that. Have you tested it?
Mark Byers
yeah, didn't work...
hdx
Did you include the ORDER BY clause, with the columns appropriately changed for your database?
Larry Lustig
OK, so I found the problem... we need to have "UPPER(username), create_date" at the very end. Thx for the help!
hdx
Not sure I understand why that's so, but I'm glad you found the solution.
Larry Lustig
A: 

In MySQL, a case-sensitive compare is done using a binary collation. So you could join the table on itself, looking for rows where the case sensitive compare is different from the case insensitive compare:

select *
from YourTable t1
inner join YourTable t2 
on t1.name <> t2.name collate latin1_bin
and t1.name = t2.name
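The same self-join idea can be tried outside MySQL. Here is a minimal sketch using Python's built-in sqlite3 module, where the default comparison is already case-sensitive, so the COLLATE clause moves to the case-insensitive side (SQLite's built-in NOCASE collation, which only folds ASCII letters). The table and column names follow the query above; the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (name TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?)",
    [("john",), ("John",), ("jOhn",), ("alice",)],  # hypothetical rows
)

# Pair up rows whose names differ byte-for-byte but match when case is ignored.
pairs = conn.execute(
    """
    SELECT t1.name, t2.name
    FROM YourTable t1
    INNER JOIN YourTable t2
        ON t1.name <> t2.name
       AND t1.name = t2.name COLLATE NOCASE
    ORDER BY t1.name, t2.name
    """
).fetchall()

for left, right in pairs:
    print(left, right)
```

Note that a self-join like this returns each row once per counterpart, so three variants of one name produce six pairs; a SELECT DISTINCT on t1's columns would be one way to collapse them.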
Andomar
A: 

Try something like this:

SELECT UserName, CreatedDate
FROM User
WHERE LOWER(TRIM(UserName)) IN 
(
SELECT LOWER(TRIM(UserName))
FROM User
GROUP BY LOWER(TRIM(UserName))
HAVING count(*) > 1
)
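As a rough illustration of what the TRIM adds, here is a sketch using Python's built-in sqlite3 module: names that differ only by case or by stray leading or trailing spaces are treated as the same user. The sample rows are hypothetical, and the ORDER BY clause is an addition that is not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "User" (UserName TEXT, CreatedDate TEXT)')
conn.executemany(
    'INSERT INTO "User" VALUES (?, ?)',
    [
        ("john", "2010-02-01"),
        (" John ", "2010-02-02"),  # same name with stray spaces
        ("alice", "2010-03-01"),   # unique, should not be listed
    ],
)

rows = conn.execute(
    """
    SELECT UserName, CreatedDate
    FROM "User"
    WHERE LOWER(TRIM(UserName)) IN
    (
        SELECT LOWER(TRIM(UserName))
        FROM "User"
        GROUP BY LOWER(TRIM(UserName))
        HAVING COUNT(*) > 1
    )
    ORDER BY LOWER(TRIM(UserName)), CreatedDate  -- added for grouping
    """
).fetchall()

print(rows)
```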
Christopherous 5000
Oops, I see Larry posted the same thing first.
Christopherous 5000