Hi there,

In my database I have a lot of users who've misspelled their e-mail address. This in turn causes my postfix to bounce a lot of mails when sending the newsletter.
Forms include (but are not limited to) "yaho.com", "yahho .com" etc.
Very annoying!

So I have been trying to update those records to the correct value.
After executing `select email from users where email like '%@yaho%' and email not like '%yahoo%';` and getting the list, I'm stuck: I don't know how to update only the "yaho" part. The username needs to be left intact.

So I thought I would just dump the database and use vim to do the replacement, but I cannot escape the @ symbol.

BTW, how do I select all email addresses written in CAPS? `select upper(email) from users;` would just transform everything into CAPS, whereas I only need to find the addresses that are already written in CAPS.

A: 

You could try using INSTR together with SUBSTR or LEFT to get the part before the "@" symbol, perhaps.

Something like SELECT LEFT("[email protected]",INSTR("[email protected]","@")-1); seems to work.
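
Before changing anything, it may help to preview what such a rewrite would produce. The query below is only a sketch in MySQL syntax, assuming the `users` table and `email` column from the question. It uses `SUBSTRING_INDEX`, an alternative to the `LEFT`/`INSTR` pair, to take everything before the first `@`; the column aliases are made up for illustration.

```sql
-- Preview only: nothing is modified by this statement.
-- SUBSTRING_INDEX(email, '@', 1) returns the text before the first '@'.
SELECT email                                                AS current_value,
       CONCAT(SUBSTRING_INDEX(email, '@', 1), '@yahoo.com') AS proposed_value
FROM   users
WHERE  email LIKE '%@yaho.com';
```

If the two columns look right, the same expression can be moved into an `UPDATE ... SET email = ...` with the same `WHERE` clause.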

CharlesLeaf
Yes, this works too.
w00t
+1  A: 

To address your second question (about finding emails written in caps), something like this might be helpful:

select email from users where upper(email) = email

(Forgive me if the syntax is not precisely correct, since I'm used to DB2. The idea is to compare the straight email address with the upper-cased version.)

Syntactic
@Syntactic: Note that MySQL's default character set and collation are case insensitive. [Source](http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html)
Daniel Vassallo
@Daniel Vassallo Good point. Learning something new every day.
Syntactic
Oh, I think I understand what you mean. But instead of finding only the upper-case mails, it also returns all the lower-case ones, about 50,000 of them :)) Fair enough about the case insensitivity; it causes no problem with sending mails or anything else, it just bothered my eyes.
w00t
@w00t: If you want to convert all the emails to lowercase, you can simply do: `UPDATE users SET email = lower(email);`
Daniel Vassallo
@Daniel: yep, already done that.
w00t
@w00t: Do it on the application side, when inserting/updating the data, and you won't have to worry about it again.
Syntactic
+1  A: 

You may want to try something like the following:

UPDATE   users
SET      email = CONCAT(LEFT(email, INSTR(email, '@')), 'yahoo.com')
WHERE    email LIKE '%@yaho.com%';

Test case:

CREATE TABLE users (email varchar(50));

INSERT INTO users VALUES ('[email protected]');
INSERT INTO users VALUES ('[email protected]');
INSERT INTO users VALUES ('[email protected]');


UPDATE   users
SET      email = CONCAT(LEFT(email, INSTR(email, '@')), 'yahoo.com')
WHERE    email LIKE '%@yaho.com%';

Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0


SELECT * FROM users;
+-----------------+
| email           |
+-----------------+
| [email protected] |
| [email protected] |
| [email protected] |
+-----------------+
3 rows in set (0.00 sec)
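
Since the question mentions several misspelled forms, one statement could cover them all by comparing only the domain part. This is a sketch, assuming each address contains a single `@`; the list holds the two forms quoted in the question and would need extending with whatever the asker's `SELECT` actually turns up.

```sql
-- SUBSTRING_INDEX(email, '@', -1) returns the text after the last '@' (the domain).
-- SUBSTRING_INDEX(email, '@', 1)  returns the text before the first '@' (the username).
UPDATE users
SET    email = CONCAT(SUBSTRING_INDEX(email, '@', 1), '@yahoo.com')
WHERE  SUBSTRING_INDEX(email, '@', -1) IN ('yaho.com', 'yahho .com');
```

Matching the whole domain, rather than using a trailing `%` wildcard, avoids rewriting addresses whose domain merely starts with one of the misspellings.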

To answer your second question, you probably need to use a case-sensitive collation such as latin1_general_cs:

SELECT * FROM users WHERE email COLLATE latin1_general_cs = UPPER(email);

Test case:

INSERT INTO users VALUES ('[email protected]');


SELECT * FROM users;   
+-----------------+
| email           |
+-----------------+
| [email protected] |
| [email protected] |
| [email protected] |
| [email protected] |
+-----------------+
4 rows in set (0.00 sec)


SELECT * FROM users WHERE email COLLATE latin1_general_cs = UPPER(email);
+-----------------+
| email           |
+-----------------+
| [email protected] |
+-----------------+
1 row in set (0.00 sec)
Daniel Vassallo
Great example! Worked without problems. One thing though: the database is UTF8, so I cannot use latin1 collations. Not a big deal, since the SELECT isn't vital; I just updated everything to lowercase. Thank you!
w00t
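
For a UTF8 column, the same comparison should work with a binary collation from that character set. This is a sketch, assuming the column uses MySQL's `utf8` character set; for `utf8mb4` the matching collation name would be `utf8mb4_bin`.

```sql
-- utf8_bin compares strings by code point, so 'A' and 'a' are different.
-- Only rows whose stored value already equals its upper-cased form are returned.
SELECT email
FROM   users
WHERE  email COLLATE utf8_bin = UPPER(email);
```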
A: 

For the first question, I would choose something like

UPDATE users
SET email = INSERT(email,INSTR(email,'@'), LENGTH(email), '@yahoo.com')
WHERE email LIKE '%@yaho.com'

Just for the sake of being thorough, this is multi-byte safe even though I've used LENGTH. All that's needed is for the third argument of INSERT to be at least as large as the end of the substring.

Syntactic's approach to finding the all-caps emails is a good one. The following may perform slightly faster, although you would likely not notice the difference:

SELECT email FROM users WHERE BINARY(email) NOT REGEXP '[a-z]'

Update: BINARY(email) is needed to force case-sensitive matching.
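
On newer servers the regular-expression engine changed, and applying `REGEXP` to a binary string may be rejected. A possible equivalent, assuming MySQL 8.0 or later, is `REGEXP_LIKE` with an explicit case-sensitive match type:

```sql
-- Assumes MySQL 8.0+, where REGEXP_LIKE takes an optional match-type argument.
-- 'c' requests case-sensitive matching, so '[a-z]' matches lowercase letters only.
SELECT email
FROM   users
WHERE  NOT REGEXP_LIKE(email, '[a-z]', 'c');
```

As with the `BINARY` version, this returns every address that contains no lowercase letter, which includes addresses with no letters at all.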

David M
@David: I think you need to wrap the `@` in single quotes in the `INSTR()` function, otherwise it will return `NULL`. Also I guess your last query should look something like `FROM users WHERE email NOT REGEXP ...`.
Daniel Vassallo
@David M: Indeed, the `@` needed single quotes because it returned `NULL`. It worked after. The second query matches only emails that are made of numbers or +--_ but no alphabetical ones.
w00t
Thanks for the corrections!
David M
@David M: great, it works!
w00t