I am trying to create a regular expression in C# that allows only alphanumeric characters and spaces. Currently, I am trying the following:

string pattern = @"^\w+$";
Regex regex = new Regex(pattern);
if (!regex.IsMatch(value))
{
  // Display error
}

What am I doing wrong?

+1  A: 

Try this Regex: "^[0-9A-Za-z ]+$"

The brackets specify a set of characters

0-9: All digits

A-Z: All capital letters

a-z: All lowercase letters

' ': Spaces

Do you also need to allow letters outside A-Z, such as accented letters?
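
A minimal sketch of how this pattern behaves, assuming a hypothetical helper class `AsciiAlnumCheck` (the class and method names are illustrative, not from the question). The sample inputs are made up to show what the character set accepts and rejects:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsciiAlnumCheck
{
    // Pattern from the answer: ASCII digits, ASCII letters, and the space character.
    private static readonly Regex Pattern = new Regex("^[0-9A-Za-z ]+$");

    // Hypothetical helper: true when every character is in the allowed set.
    public static bool IsValid(string value)
    {
        return Pattern.IsMatch(value);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("Hello World 42")); // True
        Console.WriteLine(IsValid("Hello_World"));    // False: underscore is not in the set
        Console.WriteLine(IsValid("caf\u00E9"));      // False: accented letter is not in A-Z or a-z
        Console.WriteLine(IsValid(""));               // False: + requires at least one character
    }
}
```

Note that the `+` quantifier rejects the empty string; use `*` instead if empty input should pass.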

John JJ Curtis
+2  A: 

The character class \w does not match spaces. Try replacing it with [\w ] (there's a space after the \w) to match word characters and spaces. You could also replace the space with \s if you want to match any whitespace.
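
A short sketch comparing the three variants, assuming a hypothetical class `WordCharCheck` with illustrative method names. It also shows that `\w` accepts the underscore, which may or may not be wanted:

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCharCheck
{
    // Original pattern from the question: word characters only.
    public static bool WordOnly(string s) => Regex.IsMatch(s, @"^\w+$");

    // Word characters plus the literal space character.
    public static bool WordOrSpace(string s) => Regex.IsMatch(s, @"^[\w ]+$");

    // Word characters plus any whitespace (tabs, newlines, and so on).
    public static bool WordOrWhitespace(string s) => Regex.IsMatch(s, @"^[\w\s]+$");

    public static void Main()
    {
        Console.WriteLine(WordOnly("hello world"));          // False: \w does not match the space
        Console.WriteLine(WordOrSpace("hello world"));       // True
        Console.WriteLine(WordOrSpace("hello\tworld"));      // False: a tab is not a space
        Console.WriteLine(WordOrWhitespace("hello\tworld")); // True
        Console.WriteLine(WordOrSpace("hello_world"));       // True: underscore is a word character
        Console.WriteLine(WordOrSpace("hello-world"));       // False: hyphen is not a word character
    }
}
```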

Michael Petito
Following up on a comment from Michael under my answer, which I think is better suited here: be aware that `\w` matches more than just ASCII letters and digits. It also matches the underscore and other connector punctuation, nonspacing marks, and letters and decimal digits from any script: http://msdn.microsoft.com/en-us/library/6w3ahtyy.aspx and http://msdn.microsoft.com/en-us/library/20bw873z%28v=VS.100%29.aspx#WordCharacter
Abel
+1  A: 

If, other than 0-9, a-z and A-Z, you also need to cover accented letters like ï, é, æ, Ć or Ş, then you should use the Unicode properties \p{...} for matching, e.g. (note the space):

string pattern = @"^[\p{L}\p{Nd} ]+$";
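
A hedged sketch of the Unicode-category approach. It uses the general category names `\p{L}` (letters) and `\p{Nd}` (decimal digits), which are the forms .NET accepts; in .NET the `Is...` prefix is reserved for named blocks such as `\p{IsGreek}`. The class name `UnicodeAlnumCheck` and the sample inputs are illustrative assumptions:

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeAlnumCheck
{
    // Letters from any script, decimal digits from any script, and the space character.
    private static readonly Regex Pattern = new Regex(@"^[\p{L}\p{Nd} ]+$");

    // Hypothetical helper: true when the whole string consists of allowed characters.
    public static bool IsValid(string value)
    {
        return Pattern.IsMatch(value);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("caf\u00E9 42"));     // True: é is a letter
        Console.WriteLine(IsValid("\u0106ma \u00E6"));  // True: Ć and æ are letters
        Console.WriteLine(IsValid("a_b"));              // False: underscore is punctuation
        Console.WriteLine(IsValid("a,b"));              // False: comma is punctuation
    }
}
```

Unlike `\w`, this set excludes the underscore, so it is closer to "alphanumeric and spaces only" when non-ASCII letters must be allowed.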
Abel
Actually, unless you specify `ECMAScript`, `\w` uses those Unicode properties and more: http://msdn.microsoft.com/en-us/library/20bw873z(v=VS.100).aspx#WordCharacter
Michael Petito
Maybe I misunderstand, but the ECMAScript modifier does not influence the behavior of Unicode properties. It does influence the `\w` behavior. And you're right: `\w` matches far more than just letters; it also matches digits, the underscore and other connector punctuation, and nonspacing marks. `\p{L}` is far more precise and only matches letters: http://msdn.microsoft.com/en-us/library/yyxz6h5w.aspx
Abel
Yes I think we're saying the same thing ;-) It all depends on what the OP needs. I was simply pointing out that `\w` matches more than `[0-9a-zA-Z]`. It also includes accented letters for example (unless `ECMAScript` is specified).
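
A small sketch of the difference discussed in these comments, assuming a hypothetical class `EcmaScriptWordCheck`. It runs the same `^\w+$` pattern with and without `RegexOptions.ECMAScript`:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EcmaScriptWordCheck
{
    // Default behavior: \w is Unicode-aware.
    public static bool DefaultWord(string s) => Regex.IsMatch(s, @"^\w+$");

    // With RegexOptions.ECMAScript, \w is restricted to [a-zA-Z0-9_].
    public static bool EcmaWord(string s) =>
        Regex.IsMatch(s, @"^\w+$", RegexOptions.ECMAScript);

    public static void Main()
    {
        Console.WriteLine(DefaultWord("caf\u00E9")); // True: é counts as a word character
        Console.WriteLine(EcmaWord("caf\u00E9"));    // False: é is outside [a-zA-Z0-9_]
        Console.WriteLine(EcmaWord("a_b1"));         // True: underscore still matches
    }
}
```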
Michael Petito