Hey gurus, I am looking for some help creating a regular expression for an unusual input format in our system. Our keypress handler already restricts input to digits plus the letters A and M. Now I need a regex that can check the input during the onblur event to ensure the format is correct.

I have some examples below of what would be valid. The letter A represents an age, so it is always followed by one to three digits. The letter M can only occur at the end of the string.

Examples:

Valid Input

1-M
10-M
100-M
5-7
5-20
5-100
10-20
10-100
A5-7
A10-7
A100-7
A10-20
A5-A7
A10-A20
A10-A100
A100-A102

Invalid Input
a-a
a45
4

Thanks for your help, Kevin

+1  A: 
^A?\d+-(?:A?\d+|M)$

An optional A followed by one or more digits, a dash, and then either another optional A with one or more digits, or an M. The '(?: ... )' notation is a Perl-style non-capturing group around the alternatives; it means no '$1' is set when the regex matches. If you wanted to capture the individual pieces, you could use ordinary capturing parentheses instead, and the non-capturing group would no longer be needed.

(You could replace the '+' with '{1,3}' as JasonV did to limit the numbers to 3 digits.)
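
As a quick check, here is a minimal JavaScript sketch that runs the three-digit variant of this pattern against the samples from the question. The names `rangePattern`, `validSamples`, and `invalidSamples` are made up for illustration and are not part of the original answer.

```javascript
// Three-digit variant of the pattern above ('+' replaced with '{1,3}').
const rangePattern = /^A?\d{1,3}-(?:A?\d{1,3}|M)$/;

// Samples copied from the question.
const validSamples = [
  "1-M", "10-M", "100-M",
  "5-7", "5-20", "5-100", "10-20", "10-100",
  "A5-7", "A10-7", "A100-7", "A10-20",
  "A5-A7", "A10-A20", "A10-A100", "A100-A102",
];
const invalidSamples = ["a-a", "a45", "4"];

// Every valid sample should match; every invalid sample should not.
const allValidPass = validSamples.every((s) => rangePattern.test(s));
const allInvalidFail = invalidSamples.every((s) => !rangePattern.test(s));

console.log(allValidPass, allInvalidFail); // true true
```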

Jonathan Leffler
You forgot the '?' quantifier after the second A.
Michael Carman
Yup - thanks (fixed).
Jonathan Leffler
+2  A: 
/^[A]?[0-9]{1,3}-[A]?[0-9]{1,3}[M]?$/

Matches anything of the form:

A(optional)[1-3 numbers]-A(optional)[1-3 numbers]M(optional)
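
One thing worth checking, assuming the question's samples are the full specification: because the digits after the dash are required and the M is a separate optional suffix, this pattern treats the M cases differently from the samples. A small sketch (the variable name `suffixPattern` is just for illustration):

```javascript
// The pattern from this answer: digits are required on both sides,
// and M is an optional trailing character.
const suffixPattern = /^[A]?[0-9]{1,3}-[A]?[0-9]{1,3}[M]?$/;

console.log(suffixPattern.test("A5-7")); // true
console.log(suffixPattern.test("5-7M")); // true  (M allowed after digits)
console.log(suffixPattern.test("1-M"));  // false (no digits after the dash)
```

If "1-M" has to be accepted, the M needs to be an alternative to the digits rather than a suffix.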
JasonV
You don't need the character class square brackets around the A or M letters.
Jonathan Leffler
This wouldn’t match “A5-7”.
Gumbo
@Jonathan - I know, but I tend to do that out of habit. It's one of my many "coding practices".
@Gumbo - I don't see why not... enlighten me?
JasonV
Because your second range says exactly '{3}' instead of '{1,3}'.
Jonathan Leffler
Thanks for the feedback. Just edited.
JasonV
+1  A: 
^A?\d{1,3}-(M|A?\d{1,3})$

^ -- the match must start at the beginning of the string
A? -- "A" is optional
\d{1,3} -- between one and three digits; [0-9]{1,3} also works
- -- A "-" character
(...|...) -- Either one of the two expressions
(M|...) -- Either "M" or...
(...|A?\d{1,3}) -- an optional "A" followed by at least one and at most three digits
$ -- the match must extend to the end of the string

Some consequences of changing the format. If you do not put "^" at the beginning, the match may ignore an invalid beginning. For example, "MAAMA0-M" would be matched at "A0-M".

If, likewise, you leave $ out, the match may ignore an invalid trail. For example, "A0-MMMMAAMAM" would match "A0-M".
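
The effect of the anchors can be seen in a short JavaScript sketch using the two example strings above. The variable names are illustrative only.

```javascript
// Same pattern with and without the ^ and $ anchors.
const anchored = /^A?\d{1,3}-(M|A?\d{1,3})$/;
const unanchored = /A?\d{1,3}-(M|A?\d{1,3})/;

const badStart = "MAAMA0-M";
const badTrail = "A0-MMMMAAMAM";

console.log(anchored.test(badStart));      // false
console.log(anchored.test(badTrail));      // false

// Without anchors the regex finds a valid substring and reports success.
const foundInBadStart = badStart.match(unanchored)[0];
const foundInBadTrail = badTrail.match(unanchored)[0];
console.log(foundInBadStart);              // "A0-M"
console.log(foundInBadTrail);              // "A0-M"
```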

Using \d is usually preferred over [0-9], as is \w for word characters (alphanumerics and underscore), \s for whitespace, \D for non-digit, \W for non-word characters or \S for non-whitespace. But you must be careful that \d is not being treated as a string escape sequence. If the pattern is written inside a string literal, you might need to write it \\d instead.
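
In JavaScript, for instance, this matters when the pattern is built from a string with `new RegExp(...)` rather than written as a regex literal. A minimal sketch (variable names are illustrative):

```javascript
// Regex literal: \d reaches the regex engine untouched.
const fromLiteral = /^A?\d{1,3}-(M|A?\d{1,3})$/;

// String with doubled backslashes: equivalent to the literal.
const fromEscapedString = new RegExp("^A?\\d{1,3}-(M|A?\\d{1,3})$");

// String with single backslashes: the string parser turns "\d" into "d",
// so the regex now looks for the letter d instead of a digit.
const fromUnescapedString = new RegExp("^A?\d{1,3}-(M|A?\d{1,3})$");

console.log(fromLiteral.test("5-7"));          // true
console.log(fromEscapedString.test("5-7"));    // true
console.log(fromUnescapedString.test("5-7"));  // false
console.log(fromUnescapedString.test("d-M"));  // true
```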

{x,y} means the preceding element must occur between x and y times.

? means the preceding element must occur once or not at all.

Whatever is inside () is treated as a single element. (ABC)? will match ABC or nothing at all.
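
A few one-line checks in JavaScript illustrate the quantifier and grouping rules above. The pattern names are made up for this sketch.

```javascript
// {1,3}: one to three repetitions of the preceding element.
const oneToThreeDigits = /^\d{1,3}$/;
console.log(oneToThreeDigits.test("123"));   // true
console.log(oneToThreeDigits.test("1234"));  // false

// (ABC)?: the whole group is optional.
const optionalGroup = /^(ABC)?$/;
console.log(optionalGroup.test(""));         // true
console.log(optionalGroup.test("ABC"));      // true
console.log(optionalGroup.test("AB"));       // false

// ABC?: without parentheses, only the C is optional.
const optionalLastLetter = /^ABC?$/;
console.log(optionalLastLetter.test("AB"));  // true
```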

Daniel
A: 
Gumbo
+3  A: 

This matches all of the samples.

/A?\d{1,3}-A?\d{0,3}M?/

Not sure whether 10-A10M should or shouldn't be legal, or even whether M can appear together with numbers. If M is only allowed on its own, without numbers:

/A?\d{1,3}-(A?\d{1,3}|M)/
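
Note that neither pattern is anchored, so both report a match when a valid-looking piece appears anywhere inside a longer string. A small JavaScript sketch of how the two versions behave on a few edge cases (the variable names and test strings are just for illustration):

```javascript
const loosePattern = /A?\d{1,3}-A?\d{0,3}M?/;       // first version
const strictPattern = /A?\d{1,3}-(A?\d{1,3}|M)/;    // second version
const anchoredStrict = /^A?\d{1,3}-(A?\d{1,3}|M)$/; // second version plus ^ and $

// Everything after the dash is optional in the first version.
console.log(loosePattern.test("A123-"));       // true
console.log(loosePattern.test("10-A10M"));     // true

// The second version needs digits or an M after the dash.
console.log(strictPattern.test("A123-"));      // false

// Unanchored, it still finds "234-567" inside a longer string.
console.log(strictPattern.test("1234-5678"));  // true
console.log(anchoredStrict.test("1234-5678")); // false
```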
jmucchiello
In your second version, the digits only alternative is superfluous (also covered by the optional-A-plus-digits alternative).
Jonathan Leffler
True, I'll remove it.
jmucchiello
This would also match A123-
PatrikAkerstrand
No it won't. There must be a digit or an M at a minimum on the right.
jmucchiello
+2  A: 

Use the brute-force method if you have a small number of well-defined patterns, so you don't get bad corner-case matches:

^(\d+-M|\d+-\d+|A\d+-\d+|A\d+-A\d+)$

Here are the individual regexes broken out:

\d+-M      <- matches anything like '1-M'
\d+-\d+    <- 5-7
A\d+-\d+   <- A5-7
A\d+-A\d+  <- A10-A20
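
To show the kind of corner case this avoids, here is a rough JavaScript sketch comparing the explicit alternation with an optional-A style pattern. The question does not say whether inputs like "5-A7" or "A5-M" should be valid, so treat that as an open assumption. `isValidRange` and `handleBlur` are hypothetical helper names, not part of any library.

```javascript
// The brute-force pattern from above: only the four listed shapes are allowed.
const explicitPattern = /^(\d+-M|\d+-\d+|A\d+-\d+|A\d+-A\d+)$/;
// An optional-A pattern for comparison: allows any mix of A and no-A.
const optionalAPattern = /^A?\d+-(?:A?\d+|M)$/;

function isValidRange(value) {
  return explicitPattern.test(value);
}

// Hypothetical blur handler: expects an event-like object with target.value.
function handleBlur(event) {
  return isValidRange(event.target.value);
}

console.log(isValidRange("A10-A20"));        // true
console.log(isValidRange("5-A7"));           // false
console.log(optionalAPattern.test("5-A7"));  // true
console.log(isValidRange("A5-M"));           // false
console.log(optionalAPattern.test("A5-M"));  // true
console.log(handleBlur({ target: { value: "10-M" } })); // true
```

In a browser, the handler could be attached with something like `input.addEventListener("blur", handleBlur)`, where `input` is the text field in question.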
Anton