I have two regular expressions that I use to validate Colorado driver's license formats.

[0-9]{2}[-][0-9]{3}[-][0-9]{4}

and

[0-9]{9}

The number must contain exactly 9 digits, but the user is free to enter it either as 123456789 or as 12-345-6789.

Is there a way I can combine these into one? Like a regex conditional statement of sorts? Right now I am simply enumerating through all the available formats and breaking out once one is matched. I could always strip the hyphens out before I do the compare and only use [0-9]{9}, but then I won't be learning anything new.

+11  A: 

For a straight combine,

(?:[0-9]{2}[-][0-9]{3}[-][0-9]{4}|[0-9]{9})

or to merge the logic (allowing dashes in one position without the other, which may not be desired),

[0-9]{2}-?[0-9]{3}-?[0-9]{4}

(The brackets around the hyphens in your first regex aren't doing anything.)

Or merging the logic so that both hyphens are required if one is present,

(?:\d{2}-\d{3}-|\d{5})\d{4}

(Your [0-9]s can also be replaced with \ds.)
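
To see how the three patterns above differ, here is a minimal sketch in Python (chosen only for illustration; the sample inputs and the names `PATTERNS`, `SAMPLES`, and `accepted` are made up). It assumes the whole input must match, so it uses `re.fullmatch`, and it passes `re.ASCII` so that `\d` behaves like `[0-9]`.

```python
import re

# The three patterns from this answer, in the order they are given.
PATTERNS = {
    "alternation": r"(?:[0-9]{2}[-][0-9]{3}[-][0-9]{4}|[0-9]{9})",
    "optional_dashes": r"[0-9]{2}-?[0-9]{3}-?[0-9]{4}",
    "both_or_none": r"(?:\d{2}-\d{3}-|\d{5})\d{4}",
}

# Hypothetical sample inputs, including two with only one dash
# and one with ten digits.
SAMPLES = ["123456789", "12-345-6789", "12-3456789", "12345-6789", "1234567890"]


def accepted(pattern: str, text: str) -> bool:
    # fullmatch requires the pattern to cover the entire string,
    # so a tenth digit causes a rejection.
    return re.fullmatch(pattern, text, flags=re.ASCII) is not None


# For each pattern, collect the samples it accepts.
results = {
    name: [s for s in SAMPLES if accepted(p, s)] for name, p in PATTERNS.items()
}

for name, matched in results.items():
    print(f"{name}: {matched}")
```

Under these assumptions, only the second pattern should accept the single-dash inputs; the first and third accept just the two intended formats.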

chaos
Second version allows one dash, which I believe is not the desired result.
cletus
Yeah, true. Noted in edit.
chaos
Thank you. Testing your examples here: http://www.fileformat.info/tool/regex.htm gave me the exact results I wanted. Also, thanks for the tips on [-] and \d.
northpole
Which version did you end up using?
chaos
I used: (?:\d{2}-\d{3}-|\d{5})\d{4}
northpole
A: 

I think this should work:

\d{2}-?\d{3}-?\d{4}
Philippe Leybaert
+6  A: 

How about using a backreference to match the second hyphen only if the first is given:

\d{2}(-?)\d{3}\1\d{4}

I've never used regexes in Java, though, so even if backreferences are supported there, the syntax might be different. I've only tried this out in Ruby.
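
A minimal sketch of the backreference approach, written in Python rather than Ruby or Java purely for illustration. The names `LICENSE` and `is_valid` are hypothetical, and the sketch assumes the whole input must match.

```python
import re

# Group 1 captures "-" or the empty string; \1 must then repeat
# exactly the text that group 1 captured.
LICENSE = re.compile(r"\d{2}(-?)\d{3}\1\d{4}", re.ASCII)


def is_valid(text: str) -> bool:
    # fullmatch requires the pattern to cover the entire string.
    return LICENSE.fullmatch(text) is not None


for sample in ["123456789", "12-345-6789", "12-3456789", "12345-6789"]:
    print(sample, is_valid(sample))
```

The two single-dash inputs are rejected because the second separator has to equal whatever the first one matched.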

Nefrubyr
+1: I wouldn't use it for this case, but I like the tricksyness.
chaos
The syntax is the same in Java, but if you write it in the form of a String literal you'll have to escape the backslashes: "\\d{2}(-?)\\d{3}\\1\\d{4}"
Alan Moore
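This would continue to work if you allowed other separators, e.g. `[-\\\/\|]` in place of the `-`, since the \1 forces the second separator to be the same as the first.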
Brad Gilbert
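
A sketch of that separator idea in Python, assuming the separator set is a dash, backslash, forward slash, or pipe. The names `LICENSE_ANY_SEP` and `is_valid_any_sep` are hypothetical, and nothing in the question says these extra separators are actually wanted.

```python
import re

# Group 1 captures one optional separator from the class: - \ / |
# The backreference \1 then demands the same separator (or none) again.
LICENSE_ANY_SEP = re.compile(r"\d{2}([-\\/|]?)\d{3}\1\d{4}", re.ASCII)


def is_valid_any_sep(text: str) -> bool:
    return LICENSE_ANY_SEP.fullmatch(text) is not None


# Raw strings keep the backslashes in the sample literal.
for sample in ["12/345/6789", r"12\345\6789", "12|345|6789", "12/345-6789"]:
    print(sample, is_valid_any_sep(sample))
```

Mixed separators such as 12/345-6789 are rejected, which a plain optional character class in both positions would not guarantee.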
+3  A: 

A neat version that accepts either both dashes or no dashes is:

\d{2}(-?)\d{3}\1\d{4}

The capture group (the parentheses) captures either '-' or the empty string. The \1 then matches whatever was captured a second time.

Callum
In addition, you can tell what format it was later by looking at $1 (at least in Perl). I'd never thought of that device.
Jon Ericson
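
The same trick of inspecting the captured group can be sketched outside Perl. Below is a Python version, for illustration only; the function name `entry_format` and the labels "dashed" and "plain" are made up.

```python
import re
from typing import Optional

LICENSE = re.compile(r"\d{2}(-?)\d{3}\1\d{4}", re.ASCII)


def entry_format(text: str) -> Optional[str]:
    """Return "dashed" or "plain" for a valid number, or None if invalid."""
    m = LICENSE.fullmatch(text)
    if m is None:
        return None
    # Group 1 holds "-" when the dashed form was entered, "" otherwise.
    return "dashed" if m.group(1) == "-" else "plain"


for sample in ["123456789", "12-345-6789", "12-3456789"]:
    print(sample, entry_format(sample))
```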