Could you provide a regex that matches Twitter usernames?

Extra bonus if a Python example is provided.

+5  A: 

If you're talking about the @username thing they use on Twitter, then you can use this:

import re
twitter_username_re = re.compile(r'@([A-Za-z0-9_]+)')

To make every instance an HTML link, you could do something like this:

my_html_str = twitter_username_re.sub(lambda m: '<a href="http://twitter.com/%s">%s</a>' % (m.group(1), m.group(0)), my_tweet)
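
For anyone who wants to try this end to end, here is a minimal runnable sketch of the same idea. The helper names (find_usernames, link_usernames) and the sample tweet are made up for illustration, and it assumes the tweet text is already safe to embed in HTML.

```python
import re

# Same pattern as above: "@" followed by letters, digits, or underscores.
twitter_username_re = re.compile(r'@([A-Za-z0-9_]+)')


def find_usernames(text):
    """Return the usernames (without the leading "@") mentioned in text."""
    return twitter_username_re.findall(text)


def link_usernames(text):
    """Replace each @username in text with an HTML link to that profile."""
    def to_link(match):
        # group(1) is the bare name, group(0) is the full "@name" match.
        return '<a href="http://twitter.com/%s">%s</a>' % (match.group(1), match.group(0))
    return twitter_username_re.sub(to_link, text)


sample_tweet = "Thanks @alice_01 and @Bob!"  # hypothetical input
print(find_usernames(sample_tweet))
print(link_usernames(sample_tweet))
```

Using a named inner function instead of a lambda is only a readability choice; the substitution itself is the same.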
icktoofay
Is there any official specification?
Juanjo Conti
No, but I know that Twitter usernames can contain alphanumerics and underscores, and if they do allow anything else, it's not commonly seen in the wild.
icktoofay
+1  A: 

The only characters accepted in the form are A-Z, 0-9, and underscore. Usernames are not case-sensitive, though, so you could use r'(?i)@[a-z0-9_]+' (the inline flag has to come at the start of the pattern) to match everything correctly and also discern between users.

echoback
It doesn't make much of a difference that they are not case-sensitive. `(?i)` refers to your pattern, not the value you capture. It's still up to the program to deal with ABC and Abc as the same value.
Kobi
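
To illustrate the point in the comment above, here is a small sketch that matches case-insensitively and then normalizes the captured names itself. The function name unique_usernames is hypothetical, and lowercasing is just one assumed way to treat ABC and Abc as the same user.

```python
import re

# re.IGNORECASE affects how the pattern matches, not the text that is captured.
username_re = re.compile(r'@([a-z0-9_]+)', re.IGNORECASE)


def unique_usernames(text):
    """Return mentioned usernames, lowercased and de-duplicated in first-seen order."""
    lowered = (name.lower() for name in username_re.findall(text))
    # dict.fromkeys keeps insertion order while dropping repeats.
    return list(dict.fromkeys(lowered))


print(unique_usernames("@ABC said hi to @Abc and @xyz"))
```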
+3  A: 

Twitter recently open-sourced both Java and Ruby (gem) implementations of the code they use for finding usernames, hashtags, lists, and URLs.

The code relies heavily on regular expressions.

Evan
+1  A: 

A shorter pattern, /@(\w+)/, works fine.
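
One thing to watch if you use this from Python 3: for str patterns, \w also matches non-ASCII letters and digits unless you pass re.ASCII. A quick sketch of the difference (the variable names and the sample handle are made up):

```python
import re

# In Python 3, \w on str patterns is Unicode-aware by default.
unicode_re = re.compile(r'@(\w+)')
# With re.ASCII, \w is limited to [a-zA-Z0-9_].
ascii_re = re.compile(r'@(\w+)', re.ASCII)

sample = "@jos\u00e9"  # "@josé", hypothetical input with a non-ASCII letter

print(unicode_re.findall(sample))  # capture includes the accented letter
print(ascii_re.findall(sample))    # capture stops before the accented letter
```

If usernames really are limited to A-Z, 0-9, and underscore, the re.ASCII variant is probably the closer match.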

henasraf
A: 

Here is a PHP function that links URLs, mailto addresses, Twitter usernames, and arguments tags.

Pons