The usual shorthand \w for regular expressions in the .NET Framework matches alphanumeric characters, and thus is equivalent to [a-zA-Z0-9], right? Is there an [a-zA-Z] equivalent in .NET?

+5  A: 

Not quite: \w also matches the underscore and accented characters (ä, ó, etc.).

If you just want to match letters (including accented ones), you can use the Unicode property \p{L}. If you want only unaccented Latin letters, [a-zA-Z] itself works in .NET as it does elsewhere.
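As a rough illustration of the difference, here is a minimal sketch comparing \p{L} with [a-zA-Z]. The class and method names (LetterClassDemo, IsAllLetters, IsAsciiLetters) are made up for this example; only Regex.IsMatch comes from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class LetterClassDemo
{
    // \p{L} matches any Unicode letter, accented or not.
    public static bool IsAllLetters(string s)
    {
        return Regex.IsMatch(s, @"^\p{L}+$");
    }

    // [a-zA-Z] matches only the 26 unaccented Latin letters.
    public static bool IsAsciiLetters(string s)
    {
        return Regex.IsMatch(s, @"^[a-zA-Z]+$");
    }

    public static void Main()
    {
        string name = "M\u00FCller"; // "Müller" with a precomposed ü

        Console.WriteLine(IsAllLetters(name));        // True
        Console.WriteLine(IsAsciiLetters(name));      // False
        Console.WriteLine(IsAllLetters("abc_1"));     // False: '_' and '1' are not letters
        Console.WriteLine(Regex.IsMatch("_", @"^\w$")); // True: \w includes the underscore
    }
}
```

Note that \p{L} covers base letters only, so a string that stores its accents as separate combining marks would need \p{M} added to the class.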

Tim Pietzcker
more info on http://www.regular-expressions.info/unicode.html
Jader Dias
Here's a link to all the character classes in .NET Regex in great detail: http://msdn.microsoft.com/en-us/library/20bw873z.aspx
280Z28
A: 

From the MSDN documentation:

If ECMAScript-compliant behavior is specified, \w is equivalent to [a-zA-Z_0-9]. For information on ECMAScript regular expressions, see Regular Expression Options and ECMAScript vs. Canonical Matching Behavior.

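So if you use

new Regex(@"\w", RegexOptions.ECMAScript);

it will be equivalent to [a-zA-Z_0-9]. (Note the verbatim string literal @"\w": a plain "\w" is an unrecognized escape sequence in C# and will not compile.)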
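To see the effect of the option, here is a small sketch that runs the same pattern with and without RegexOptions.ECMAScript. The class and method names (EcmaWordDemo, IsWordDefault, IsWordEcma) are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class EcmaWordDemo
{
    // Default behavior: \w is Unicode-aware, so accented letters match.
    public static bool IsWordDefault(string s)
    {
        return Regex.IsMatch(s, @"^\w+$");
    }

    // ECMAScript behavior: \w is limited to [a-zA-Z_0-9].
    public static bool IsWordEcma(string s)
    {
        return Regex.IsMatch(s, @"^\w+$", RegexOptions.ECMAScript);
    }

    public static void Main()
    {
        Console.WriteLine(IsWordDefault("\u00E4")); // True:  "ä" is a Unicode letter
        Console.WriteLine(IsWordEcma("\u00E4"));    // False: "ä" is outside [a-zA-Z_0-9]
        Console.WriteLine(IsWordEcma("a_1"));       // True
    }
}
```

Even with this option \w still accepts digits and the underscore, so it is not a substitute for [a-zA-Z] when only letters are wanted.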

Jader Dias