Hello!

I have the following example tweet:

RT @user1: who are @thing and @user2?

I only want to have user1, thing and user2.

What regular expression can I use to extract those three names?

I hope you can help me.

Thanks in advance!

PS: A username may only contain letters, numbers and underscores.

+5  A: 

Tested:

/@([a-z0-9_]+)/i


In Ruby (irb):

>> "RT @user1: who are @thing and @user2?".scan(/@([a-z0-9_]+)/i)
=> [["user1"], ["thing"], ["user2"]]

In Python:

>>> import re
>>> re.findall("@([a-z0-9_]+)", "RT @user1: who are @thing and @user2?", re.I)
['user1', 'thing', 'user2']

In PHP:

<?PHP
$matches = array();
preg_match_all(
    "/@([a-z0-9_]+)/i",
    "RT @user1: who are @thing and @user2?",
    $matches);

print_r($matches[1]);
?>

Array
(
    [0] => user1
    [1] => thing
    [2] => user2
)
Stefan Gehrig
You'll have to add a capture group around the [a-z0-9_], i.e. @([a-zA-Z0-9_]+)
Martin C.
Thanks, it works fine! One last question: if the "@" must either be preceded by a space or be at the beginning of the string, can I use the following expression? "/( |^)@([a-z0-9_]+)/i"
Use lookbehind - http://www.regular-expressions.info/lookaround.html
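A word boundary \b will not work here: "@" is not a word character, so \b@ only matches when a letter, digit or underscore comes directly before the "@". The opposite assertion, \B, rejects exactly that case => /\B@([a-z0-9_]+)/i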
Stefan Gehrig
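
For the follow-up about requiring a space or the start of the string before the "@", a lookbehind could look roughly like this in Python. This is only a sketch: using the negative lookbehind (?<!\S) to express "preceded by whitespace or by nothing" is my own choice, and the sample text with the e-mail address is made up for illustration.

```python
import re

# (?<!\S) succeeds only when the previous character is whitespace
# or when there is no previous character at all (start of string).
MENTION = re.compile(r"(?<!\S)@([a-z0-9_]+)", re.I)

# Made-up sample: one mention at the start, one e-mail address to skip.
text = "@user0 RT @user1: mail me@example.com or ask @user2?"

names = MENTION.findall(text)
print(names)
```

Unlike "( |^)@([a-z0-9_]+)", the lookbehind consumes no characters and adds no extra capture group, so findall returns just the names instead of (prefix, name) tuples.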
A: 

This should do it (I used named captures for convenience):

.+?@(?[a-zA-Z0-9_]+):[^@]+?@(?[^\s]+)[^@]+?@(?[a-zA-Z0-9_]+)

Colin Cochrane
PHP shows an error message when I use your expression. Something like "missing delimiter . at the end" or so.
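
The PHP error is most likely because the preg_* functions expect the pattern to be wrapped in delimiters such as /.../, and the group names seem to have been lost from the pattern as posted. As a rough sketch of the same idea in Python, with hypothetical group names (author, first, second are placeholders, not the names from the answer):

```python
import re

# Hypothetical group names; Python spells named groups as (?P<name>...).
PATTERN = re.compile(
    r".+?@(?P<author>[a-zA-Z0-9_]+):"   # "RT @user1:"
    r"[^@]+?@(?P<first>[^\s]+)"         # text up to the next mention
    r"[^@]+?@(?P<second>[a-zA-Z0-9_]+)" # text up to the last mention
)

m = PATTERN.match("RT @user1: who are @thing and @user2?")
if m:
    print(m.groupdict())
```

Note that this shape only fits tweets with exactly three mentions in this arrangement; for an arbitrary number of mentions, a simple findall-style pattern is easier.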
A: 

Try an iterator (e.g. findall) with this regex:

(@[\w-]+)

bye
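
A minimal sketch of the iterator approach in Python, assuming the stricter character class from the question (letters, digits and underscores only). The helper name iter_mentions is made up for this example. Keep in mind that [\w-] also accepts hyphens, and that putting the parentheses around the whole pattern keeps the "@" in each result.

```python
import re

# Capture only the name, without the leading "@".
MENTION = re.compile(r"@([A-Za-z0-9_]+)")

def iter_mentions(text):
    """Yield each username found in text, one match at a time."""
    for match in MENTION.finditer(text):
        yield match.group(1)

names = list(iter_mentions("RT @user1: who are @thing and @user2?"))
print(names)

# For comparison: the looser pattern keeps the "@" and allows hyphens.
loose = re.findall(r"(@[\w-]+)", "hi @foo-bar")
print(loose)
```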