I have the following regex, which is supposed to match email addresses:

[a-z0-9!#$%&'*+\\-/=?^_`{|}~][a-z0-9!#$%&'*+\\-/=?^_`{|}~.]{0,63}@[a-z0-9][a-z0-9\\-]*[a-z0-9](\\.[a-z0-9][a-z0-9\\-]*[a-z0-9])+$.

I have the following code in AS3:

var mails:Array = str.toLowerCase().match(pattern);

(pattern is RegExp with the mentioned regular expression).

I retrieve two results, when str is [email protected]:

  1. [email protected]
  2. .com

Why?

+1  A: 

I'm not sure about your regex; there is a good tutorial about email validation here.

To me this reads:

[a-z0-9!#$%&'*+\-/=?^_{|}~]           # single of chosen character set
[a-z0-9!#$%&'*+\\-/=?^_{|}~.]{0,63}   # any of chosen character set with the addition of , \
@
[a-z0-9]                              # single alpha numeric
[a-z0-9\-]*                           # any alphanumeric with the addition of -
a-z                                   # single alphabetical
0-9+                                  # at least one number
$                                     # end of line
.                                     # any character

As to why you get two sub-strings in your array, it's because both match the pattern - see docs

Jon Freedman
Now my answer looks a bit daft as the regex in the question has been changed... This is what it was before honest! :)
Jon Freedman
+1  A: 

[email protected] is the match of the whole regular expression and .com is the last match of the first group ((\\.[a-z0-9][a-z0-9\\-]*[a-z0-9])).
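
A minimal sketch of the "last match of the first group" behaviour, written in TypeScript on the assumption that its non-global `match` returns the same kind of array as the AS3 call in the question. The pattern is a simplified stand-in for the one in the question and the address is made up for illustration:

```typescript
// Simplified stand-in for the question's pattern: a local part, "@",
// a first label, then one or more ".label" repetitions in a capturing group.
const pattern: RegExp = /[a-z0-9]+@[a-z0-9]+(\.[a-z0-9]+)+$/;

// Hypothetical address, used only for illustration.
const result = "user@mail.example.com".match(pattern);

// Index 0 holds the whole match; index 1 holds what group 1 captured.
const whole = result?.[0] ?? "";
const lastGroup = result?.[1] ?? "";

console.log(whole);     // user@mail.example.com
console.log(lastGroup); // .com
```

The group repeats twice here (`.example`, then `.com`), but only the last repetition is kept in the result array.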

Gumbo
+2  A: 

.com was captured by the last part of the regex (\\.[a-z0-9][a-z0-9\\-]*[a-z0-9]).

Regular expressions capture substrings matched by portions of the pattern that are enclosed in () for later use.

For example, the regex 0x([0-9a-fA-F]+) will match a hexadecimal number of the form 0x9F34 and capture the hex portion (9F34) in a separate group.
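
As a rough illustration of that hex example, here is a TypeScript sketch (assuming its `match` behaves like the AS3 one for a non-global pattern). The group uses a `+` so that it spans all the digits, and the input string is made up:

```typescript
// Capture the hex digits after the "0x" prefix in their own group.
const hexPattern: RegExp = /0x([0-9a-fA-F]+)/;
const hexMatch = "value = 0x9F34".match(hexPattern);

const hexWhole = hexMatch?.[0] ?? "";  // entire matched text, prefix included
const hexDigits = hexMatch?.[1] ?? ""; // only the part inside the parentheses

console.log(hexWhole);                // 0x9F34
console.log(hexDigits);               // 9F34
console.log(parseInt(hexDigits, 16)); // 40756
```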

Amarghosh
A: 
([a-z0-9!#$%&'*+\\-/=?^_`{|}~][a-z0-9!#$%&'*+\\-/=?^_`{|}~.]{0,63}@[a-z0-9\\-]*[a-z0-9]+\\.([a-z0-9\\-]*[a-z0-9]))+$

This seems to work as expected (tested in Regex Tester). Last capturing group removed.

alxx
A: 

To add to what others have said:

There are two results because it matches both the whole email address, and the last group surrounded by parentheses.

If you don't want a group to be captured you can add ?: to the beginning of the group. Look in the AS documentation for non-capturing groups:

http://www.adobe.com/livedocs/flash/9.0/main/wwhelp/wwhimpl/js/html/wwhelp.htm?href=00000118.html#wp129703

"A noncapturing group is one that is used for grouping only; it is not "collected," and it does not match numbered backreferences. Use (?: and ) to define noncapturing groups, as follows:

var pattern = /(?:com|org|net)/;"
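
Applied to the email case, the difference can be sketched like this. It is in TypeScript, assuming the same result shape as AS3's `match` for a non-global pattern; both patterns are simplified stand-ins for the one in the question and the address is hypothetical:

```typescript
// Same simplified pattern twice: once with a capturing group,
// once with the group marked non-capturing via "?:".
const capturing: RegExp = /[a-z0-9]+@[a-z0-9]+(\.[a-z0-9]+)+$/;
const nonCapturing: RegExp = /[a-z0-9]+@[a-z0-9]+(?:\.[a-z0-9]+)+$/;

// Hypothetical address, used only for illustration.
const address = "user@example.com";

const withGroup = address.match(capturing) ?? [];
const withoutGroup = address.match(nonCapturing) ?? [];

console.log(withGroup.length);    // 2 (whole match plus ".com")
console.log(withoutGroup.length); // 1 (whole match only)
```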

Mike Houston