Hey guys, I'm trying to come up with a regex that helps me validate a Blood Group field - which should accept only A[+-], B[+-], AB[+-] and O[+-].

Here's the regex I came up with (and tested using Regex Tester):

[A|B|AB|O][\+|\-]

Now this pattern successfully matches A,B,O[+-] but fails against AB[+-].

Can anyone please suggest a regex that'll serve my purpose?

Thanks, m^e

+14  A: 

Try:

(A|B|AB|O)[+-]

Square brackets define a character class, which matches exactly one character from the set, so [A|B|AB|O] matches a single A, B, O or a literal |. Parentheses create a group, and inside a group | means alternation, which is what you want. You also don't need to escape + or - in this character class: + has no special meaning inside a class, and - is literal when it is the first or last character.
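To see the difference concretely, here is a minimal sketch in Python (the choice of Python's re module is mine for illustration; the question doesn't name a language). It uses re.fullmatch so the entire input has to match, and the sample strings are made up.

```python
import re

# The original attempt: two character classes, each matching ONE character.
# Inside [...], '|' is a literal pipe, not alternation.
char_class = re.compile(r"[A|B|AB|O][\+|\-]")

# The grouped version: inside (...), '|' separates alternatives.
grouped = re.compile(r"(A|B|AB|O)[+-]")

# Illustrative inputs: three real blood groups and two strings with a stray pipe.
samples = ["A+", "O-", "AB+", "A|", "|+"]

class_results = {s: bool(char_class.fullmatch(s)) for s in samples}
group_results = {s: bool(grouped.fullmatch(s)) for s in samples}

for s in samples:
    print(repr(s), "class:", class_results[s], "group:", group_results[s])
```

The character-class version rejects "AB+" (three characters against a two-character pattern) and accepts strings containing a literal pipe, while the grouped version behaves as intended.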

As you mentioned in the comments, if it is a string you want to match against that has the exact values you are looking for, you might want to do this:

^(A|B|AB|O)[+-]$

Without the start of string and end of string anchors, things like "helloAB+asdads" would match.
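A quick way to check this, sketched in Python (my choice for illustration, not something the question specifies). re.search scans for a match anywhere in the input, which is roughly what many validators do by default:

```python
import re

unanchored = re.compile(r"(A|B|AB|O)[+-]")
anchored = re.compile(r"^(A|B|AB|O)[+-]$")

noisy = "helloAB+asdads"

# search() looks for the pattern anywhere in the string.
found_unanchored = unanchored.search(noisy)
found_anchored = anchored.search(noisy)
found_exact = anchored.search("AB+")

print(found_unanchored.group(0))   # the blood group buried inside the noise
print(found_anchored)              # None: the anchors reject the surrounding text
print(found_exact.group(0))
```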

Paolo Bergantino
Leaving the second brackets now also allows "AB|".
Svante
Ah, yes. Fixed.
Paolo Bergantino
+1 but if the OP wants to use it as a validator you probably should add \A and \z to your regex. The regex becomes \A(A|B|AB|O)[+-]\z
Lieven
@Paolo Bergantino: Thanks! Just tested it with http://www.gskinner.com/RegExr/ and it seems to validate just fine... I'm going to use it as a custom validator in the jQuery Validation plug-in and all regexes for the validator seem to adhere to the syntax /^regex$/. If I don't use the preceding /^ and following $/, the method fails to validate. So I converted your regex to /^(A|B|AB|O)[+-]$/ - but that doesn't validate my fields either! How to go about this?
miCRoSCoPiC_eaRthLinG
Please disregard my last comment :D That was a silly assumption. Got it working just fine without the ^ and $. Thanks to all of you :)
miCRoSCoPiC_eaRthLinG
+1  A: 

The brackets [] denote a character class, meaning "any of the characters herein". You want the parentheses () for grouping:

(A|B|AB|O)(\+|-)
Svante
+1  A: 

When you are building an alternation (e.g. (A|B|AB|O)), you should be careful with the ordering of the alternatives. Many regex engines will stop at the first alternative that matches (rather than the longest). If it weren't for the [-+] forcing a backtrack, (A|B|AB|O)[-+] would not work for "AB+". It is probably better to say (AB|A|B|O)[-+] (but you should check the docs for your regex engine).
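Here is a small sketch of the ordering effect, using Python's re module as an example of a first-match (backtracking) engine. Python is my pick for illustration; other engines may behave differently.

```python
import re

text = "AB+"

# Without the sign, the first alternative that matches wins, not the longest.
short_first = re.match(r"(A|B|AB|O)", text).group(0)
long_first = re.match(r"(AB|A|B|O)", text).group(0)

# With the sign appended, the engine backtracks past "A" (the next character
# is "B", not a sign) and eventually tries "AB", so the match still succeeds.
with_sign = re.match(r"(A|B|AB|O)[-+]", text).group(0)

print(short_first, long_first, with_sign)
```

Putting AB first removes the reliance on backtracking, which is why the reordered pattern is the safer habit.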

Also, if you do not intend to capture the antigen for later use, you should use the non-capturing grouping parentheses: (?:AB|A|B|O)[-+].
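For what it's worth, the difference between the two kinds of parentheses only shows up in what the match object hands back. A minimal Python sketch (Python chosen by me for illustration):

```python
import re

capturing = re.compile(r"(AB|A|B|O)[-+]")
non_capturing = re.compile(r"(?:AB|A|B|O)[-+]")

m_cap = capturing.fullmatch("AB-")
m_non = non_capturing.fullmatch("AB-")

# .groups on a compiled pattern is the number of capturing groups it defines.
print(capturing.groups, m_cap.groups())
print(non_capturing.groups, m_non.groups())
```

Both patterns accept the same strings; the non-capturing one simply doesn't store the antigen.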

Furthermore, if you want to ensure that the only thing in the string is a blood type, then you need anchors to prevent it from matching only part of the string: ^(?:AB|A|B|O)[-+]$. A quick note on anchors: depending on your regex engine, ^ may match the beginning of a line rather than the beginning of the string if you pass it a multiline-match option. Similarly, $ may match the end of a line rather than the end of the string. For this reason there are three other anchors in common (but not 100%) usage: \A, \Z, and \z. If your regex engine supports them, \A always matches the start of the string, \Z matches the end of the string or a newline just before the end of the string, and \z matches only the end of the string.
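The anchor behaviour is easy to try out. The sketch below uses Python, which is my choice for illustration. One caveat specific to Python's re module: its \Z matches only at the very end of the string, so it corresponds to the \z described above rather than to the Perl-style \Z. The input strings are made up.

```python
import re

pattern = r"(?:AB|A|B|O)[-+]"
multi_line_input = "junk\nAB+\njunk"

# With MULTILINE, ^ and $ match at line boundaries, so the middle line passes.
multiline_hit = re.search(rf"^{pattern}$", multi_line_input, re.MULTILINE)

# Without MULTILINE, ^ matches only at the start of the string.
default_hit = re.search(rf"^{pattern}$", multi_line_input)

# \A and \Z (Python spelling) ignore the MULTILINE flag entirely.
absolute_hit = re.search(rf"\A{pattern}\Z", multi_line_input, re.MULTILINE)

# Even without MULTILINE, Python's $ tolerates one trailing newline; \Z does not.
dollar_trailing = re.search(rf"^{pattern}$", "AB+\n")
absolute_trailing = re.search(rf"\A{pattern}\Z", "AB+\n")

print(bool(multiline_hit), bool(default_hit), bool(absolute_hit))
print(bool(dollar_trailing), bool(absolute_trailing))
```

For strict validation in Python, re.fullmatch(pattern, value) is another way to get the same effect as the absolute anchors.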

Chas. Owens
Thank you Chas. That clarified certain other aspects about this regex :)
miCRoSCoPiC_eaRthLinG