The usual shorthand character class \w for regular expressions
in the .NET Framework matches alphanumeric characters, and is thus equivalent to [a-zA-Z0-9]
, right? Is there any equivalent of [a-zA-Z]
in .NET?
A:
Not quite. \w
also matches the underscore and, by default, any Unicode letter or decimal digit, including accented characters (ä, ó, etc.).
If you just want to match letters (including accented ones), you can use the Unicode property \p{L}.
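As a rough illustration of the difference, here is a minimal sketch that tests single characters against \w, \p{L}, and [a-zA-Z] with the default options. The class and method names (WordClassDemo, IsSingleMatch) are made up for this example and are not part of .NET.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the .NET API.
public static class WordClassDemo
{
    // Returns true when the entire input consists of exactly one
    // character (or unit) matched by the given character class.
    public static bool IsSingleMatch(string characterClass, string input)
    {
        // Anchor the class so that partial matches do not count.
        return Regex.IsMatch(input, "^(?:" + characterClass + ")$");
    }

    // Letters only, including accented ones: the Unicode category L.
    public static bool IsLetter(string input)
    {
        return IsSingleMatch(@"\p{L}", input);
    }

    // Plain ASCII letters only.
    public static bool IsAsciiLetter(string input)
    {
        return IsSingleMatch("[a-zA-Z]", input);
    }
}
```

With the default options, IsSingleMatch(@"\w", "_") and IsSingleMatch(@"\w", "\u00E4") (that is, ä) should both be true, while IsLetter("_") and IsLetter("5") are false and IsAsciiLetter("\u00E4") is false.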
Tim Pietzcker
2009-10-29 17:01:57
More info at http://www.regular-expressions.info/unicode.html
Jader Dias
2009-10-29 17:07:56
Here's a link to all the character classes in .NET Regex in great detail: http://msdn.microsoft.com/en-us/library/20bw873z.aspx
280Z28
2009-10-29 17:09:41
A:
From the MSDN documentation:
If ECMAScript-compliant behavior is specified, \w is equivalent to [a-zA-Z_0-9]. For information on ECMAScript regular expressions, see Regular Expression Options and ECMAScript vs. Canonical Matching Behavior.
So if you use
new Regex(@"\w", RegexOptions.ECMAScript);
(note the verbatim string, since "\w" is not a valid C# escape sequence), \w will be equivalent to [a-zA-Z_0-9].
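To see the effect of the option, here is a small sketch comparing the default behaviour of \w with the ECMAScript behaviour. The class and member names (EcmaWordDemo, IsWordDefault, IsWordEcma) are hypothetical and chosen only for this example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class; only Regex and RegexOptions come from .NET.
public static class EcmaWordDemo
{
    // \w with default options: Unicode-aware.
    private static readonly Regex DefaultWord = new Regex(@"^\w$");

    // \w with ECMAScript-compliant behaviour: limited to [a-zA-Z_0-9].
    private static readonly Regex EcmaWord =
        new Regex(@"^\w$", RegexOptions.ECMAScript);

    // True if the input is a single word character under default rules.
    public static bool IsWordDefault(string input)
    {
        return DefaultWord.IsMatch(input);
    }

    // True if the input is a single word character under ECMAScript rules.
    public static bool IsWordEcma(string input)
    {
        return EcmaWord.IsMatch(input);
    }
}
```

Here IsWordDefault("\u00E4") (ä) should return true while IsWordEcma("\u00E4") returns false; both accept "a", "7", and "_". Note that ECMAScript mode still includes the underscore and digits, so it is not a substitute for [a-zA-Z].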
Jader Dias
2009-10-29 19:23:39