I'm trying to validate a driver's license number for a form that I am making. I was trying to use a single regex. The rules are:

  1. Max length 9 characters
  2. Alphanumeric characters only
  3. Must have at least 4 numeric characters
  4. Must have no more than 2 alphabetic characters
  5. The third and fourth character must be numeric

I'm new to regex and have been googling to try to work it out. Help with this would be appreciated.

+1  A: 

This sounds a bit like homework. In that spirit, one tool that is useful when you are starting out is a web-based regex parser like this one

akf
Question: for example, when I enter \d into a regex in VS2008 it says "unrecognised escape character". Is there any way to correct this?
North
Yes, use \\d instead.
Arkain
In C#, it is a good idea to use the verbatim string syntax for regex - i.e. @"\d" rather than "\\d".
Marc Gravell
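To make the two spellings concrete, here is a small sketch. The class and member names (EscapeDemo, Escaped, Verbatim, HasFourDigitsInARow) are made up for illustration and the pattern is only an example; the point is that both literals produce the same regex text.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Regular string literal: the backslash must itself be escaped.
    public static readonly string Escaped = "\\d{4}";

    // Verbatim string literal: backslashes are taken literally.
    public static readonly string Verbatim = @"\d{4}";

    // Hypothetical helper: true if the input contains four digits in a row.
    public static bool HasFourDigitsInARow(string input)
    {
        return Regex.IsMatch(input, Verbatim);
    }
}
```

Writing "\d" in a regular literal is what triggers the "unrecognised escape" compiler error, because \d is not a C# string escape.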
+1  A: 

Trying to solve this with just one regex is probably a little hard as you need to keep track of multiple things. I'd suggest you try validating each of the properties separately (unless it makes sense to do otherwise).

For example, you can verify the first and second properties easily with a character class that includes all alphanumeric characters and a quantifier that says it must occur between 4 and 9 times (at least 4 because the third property requires four digits anyway):

^[0-9a-zA-Z]{4,9}$

The anchors ^ and $ ensure that this will, in fact, match the entire string and not just a part of it. As Marc Gravell pointed out, without them the string "aaaaa" will match the expression "a{3}", because a regex can match a substring as well.
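A quick way to see the difference the anchors make, as a sketch in C# (the class and method names here are invented for the example):

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // Unanchored: succeeds if any substring of the input matches.
    public static bool MatchesAnywhere(string input)
    {
        return Regex.IsMatch(input, @"a{3}");
    }

    // Anchored: the whole input has to be exactly three a's.
    public static bool MatchesWhole(string input)
    {
        return Regex.IsMatch(input, @"^a{3}$");
    }

    // Properties 1 and 2: 4 to 9 characters, ASCII letters and digits only.
    public static bool LengthAndAlphanumeric(string input)
    {
        return Regex.IsMatch(input, @"^[0-9a-zA-Z]{4,9}$");
    }
}
```

One caveat: in .NET, $ also matches just before a trailing newline, so if the input could end in "\n" you may prefer \z as the end anchor.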

Checking the fifth property can be done similarly, although this time we don't care about the rest:

^..[0-9]{2}

Here the ^ character is an anchor for the start of the string, the dot (.) is a placeholder for an arbitrary character and we're checking for the third and fourth character being numeric again with a character class and a repetition quantifier.

Properties three and four are probably most easily validated by iterating through the string and keeping counters as you go along.
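A minimal sketch of that counting loop, assuming only ASCII letters and digits should count (LicenseCounters and its method name are hypothetical):

```csharp
using System;

static class LicenseCounters
{
    // One pass over the string, counting ASCII digits and ASCII letters,
    // then applying property 3 (at least 4 digits) and property 4 (at most 2 letters).
    public static bool HasEnoughDigitsAndFewLetters(string input)
    {
        int digits = 0;
        int letters = 0;
        foreach (char c in input)
        {
            if (c >= '0' && c <= '9')
            {
                digits++;
            }
            else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            {
                letters++;
            }
        }
        return digits >= 4 && letters <= 2;
    }
}
```

The explicit range checks are deliberate: char.IsDigit and char.IsLetter also accept non-ASCII Unicode characters, which is probably not what you want for a license number.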

EDIT: Marc Gravell has a very nice solution for those two cases with regular expressions as well. Didn't think of those.

If you absolutely need to do this in one regular expression, it will be a bit of work (and probably neither faster nor more readable). Basically I'd start by enumerating all the shapes such a string could take. I am using a here as a placeholder for an alphabetic character and 1 as a placeholder for a digit.

We need at least four characters (property 3), and the third and fourth are always fixed as digits. For four-character strings this leaves us with only one option:

1111

Five-character strings may introduce a letter, though, with different placements:

a1111
1a111
1111a

and, as before, the all-numeric variant:

11111

Going on like this you can probably create special rules for each case (basically I'd divide this into "no letter", "one letter" and "two letters" and enumerate the different patterns for each). You can then string together all patterns with the pipe (|) character, which denotes alternation in regular expressions.
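If you don't want to write the alternatives out by hand, you could generate them. The following is only a sketch of that idea, not something I'd necessarily recommend over the separate checks; LicensePatternBuilder, Shapes and Build are names invented for this example. It uses the same a/1 placeholders as above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class LicensePatternBuilder
{
    // Yields every allowed shape ('a' = letter, '1' = digit) for lengths 4 to 9.
    public static IEnumerable<string> Shapes()
    {
        for (int length = 4; length <= 9; length++)
        {
            // Bit i of mask set means "character i is a letter".
            for (int mask = 0; mask < (1 << length); mask++)
            {
                // Bits 2 and 3 are the third and fourth characters: digits only.
                if ((mask & 12) != 0) continue;

                char[] shape = new char[length];
                int letters = 0;
                for (int i = 0; i < length; i++)
                {
                    bool isLetter = (mask & (1 << i)) != 0;
                    shape[i] = isLetter ? 'a' : '1';
                    if (isLetter) letters++;
                }

                // At most 2 letters, at least 4 digits.
                if (letters > 2 || length - letters < 4) continue;
                yield return new string(shape);
            }
        }
    }

    // Joins all shapes into one anchored alternation.
    public static Regex Build()
    {
        IEnumerable<string> alternatives = Shapes()
            .Select(s => s.Replace("a", "[A-Za-z]").Replace("1", "[0-9]"));
        return new Regex("^(?:" + string.Join("|", alternatives) + ")$");
    }
}
```

The resulting pattern has several dozen alternatives, which illustrates why the single-regex route is not the more readable one.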

Joey
With this approach, putting it all together, minus step 4, would be a good place to start. Ensuring the limit of two letters could be done afterwards.
akf
I suspect you'd need ^ and $ for the length test to handle the max length case... "aaaaa" matches "a{3}", for example.
Marc Gravell
Eep, you're right ... correcting
Joey
+4  A: 

Does it have to be a single regex? I'd keep things simple by keeping them separate:

// requires: using System.Text.RegularExpressions;
static bool IsValid(string input)
{
    return Regex.IsMatch(input, @"^[A-Za-z0-9]{4,9}$") // length and alphanumeric
        && Regex.IsMatch(input, @"^..[0-9]{2}") // 3rd+4th are numeric
        && Regex.IsMatch(input, @"(.*[0-9]){4}") // at least 4 numeric
        && !Regex.IsMatch(input, @"(.*[A-Za-z]){3}"); // no more than 2 alpha
}
Marc Gravell
Those were my thoughts exactly. It's easier to read, easier to understand, easier to maintain.
Ian Boyd
A: 

Agree with akf.
Also check out this site, as it has some nice, clear explanations of regex concepts

zebrabox
A: 

Try this, a very good free online regex test tool:

http://www.gskinner.com/RegExr/