I need to check whether strings adhere to a particular ID format.

The format of the ID is as follows:

aBcDe-fghIj-KLmno-pQRsT-uVWxy

It is a sequence of five blocks of five letters each, upper or lower case, with the blocks separated by a single dash.

I have the following regular expression that works:

string idFormat = "[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}[-]{1}[a-zA-Z]{5}";

Note that there is no trailing dash, but all of the blocks within the ID follow the same format. I would therefore like to represent the first four blocks, each followed by a dash, as one repeated unit inside the regular expression and avoid the duplication.

I tried the following, but it doesn't work:

string idFormat = "[[a-zA-Z]{5}[-]{1}]{4}[a-zA-Z]{5}";

How do I shorten this regular expression and get rid of the duplicated parts?

What is the best way to ensure that each block also does not contain any digits?


Edit:

Thanks for the replies, I now understand the grouping in regular expressions.

I'm running a few tests against the regular expression; the following are relevant:

Test 1: aBcDe-fghIj-KLmno-pQRsT-uVWxy
Test 2: abcde-fghij-klmno-pqrst-uvwxy

With the following regular expression, both tests pass:

^([a-zA-Z]{5}-){4}[a-zA-Z]{5}$

With the next regular expression, test 1 fails:

^([a-z]{5}-){4}[a-z]{5}$

Several answers have said that it is OK to omit the A-Z when using a-z, but in this case it doesn't seem to be working.

+7  A: 

You can try:

([a-z]{5}-){4}[a-z]{5}

and make it case insensitive.

codaddict
Nice, I didn't think of making it case insensitive.
Benjol
Make sure you put `^` and `$` at the beginning and end, otherwise this would match on `1651651aBcDe-fghIj-KLmno-pQRsT-uVWxy1625361$%4g$%^£$48` ;)
Gary Green
A: 

Try

string idFormat = "([a-zA-Z]{5}[-]{1}){4}[a-zA-Z]{5}";

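That is, you basically replace your brackets with parentheses. Brackets are not meant for grouping; they define a class of accepted characters, so the outer [ ] in your attempt is parsed as part of a character class instead of as a repeated group.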
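
To make the difference concrete, here is a small sketch comparing the two patterns in C#. The variable names and the `^`/`$` anchors are my additions, and the comment about how the bracketed version is parsed reflects my understanding of the .NET engine, so treat it as an illustration and not as a definitive account:

```csharp
using System;
using System.Text.RegularExpressions;

string id = "aBcDe-fghIj-KLmno-pQRsT-uVWxy";

// With brackets: as far as I can tell, .NET reads "[[a-zA-Z]" as one character
// class (a literal '[' plus the letters), and the later "]{4}" as four literal ']'.
string bracketed = "^[[a-zA-Z]{5}[-]{1}]{4}[a-zA-Z]{5}$";

// With parentheses: "five letters plus a dash" becomes one unit that {4} repeats.
string grouped = "^([a-zA-Z]{5}[-]{1}){4}[a-zA-Z]{5}$";

Console.WriteLine(Regex.IsMatch(id, bracketed));                // False
Console.WriteLine(Regex.IsMatch(id, grouped));                  // True

// The bracketed pattern accepts this odd string instead of a real ID.
Console.WriteLine(Regex.IsMatch("abcde-]]]]fghij", bracketed)); // True
```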

However, be aware that the shortened versions are suited to validating the string, not to picking it apart. If you want to process the five blocks individually, you will want to put each one in its own capture group:

string idFormat =
    "([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})";

so you can address each group and process it.
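
A minimal sketch of reading the blocks back out in C#, assuming the five-group pattern above with `^` and `$` added (the anchors and variable names are mine):

```csharp
using System;
using System.Text.RegularExpressions;

string idFormat =
    "^([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})-([a-zA-Z]{5})$";

Match m = Regex.Match("aBcDe-fghIj-KLmno-pQRsT-uVWxy", idFormat);

if (m.Success)
{
    // Groups[0] is the whole match; Groups[1] to Groups[5] are the five blocks.
    for (int i = 1; i <= 5; i++)
    {
        Console.WriteLine($"block {i}: {m.Groups[i].Value}");
    }
}
```

With the repeated-group form, each numbered group would only expose its last repetition through `Value`, which is why the blocks are spelled out separately here.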

chiccodoro
No way, it was just swapping the square brackets for the parentheses?
fletcher
+1  A: 

This works for me, though you might want to check it:

[a-zA-Z]{5}(-[a-zA-Z]{5}){4}

(One group of five letters, followed by [dash+group of five letters] four times)

Benjol
A: 
([a-zA-Z]{5}[-]{1}){4}[a-zA-Z]{5}
TimS
+6  A: 

If you can set regex options to be case insensitive, you could replace all [a-zA-Z] with just plain [a-z]. Furthermore, [-]{1} can be written as -.

Your grouping should be done with (, ), not with [, ] (although you're correctly using the latter in specifying character sets).

Depending on context, you probably want to throw in ^...$, which match the start and end of the string respectively, to verify that the entire string is a match (i.e. that there are no extra characters).

In JavaScript, something like this:

/^([a-z]{5}-){4}[a-z]{5}$/i
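
Since the code in the question looks like C#, here is a rough equivalent there. `IsValidId` and `idRegex` are hypothetical names chosen for the sketch, and the sample inputs are made up:

```csharp
using System;
using System.Text.RegularExpressions;

// Anchored pattern with lower-case classes only; the option makes it case insensitive.
var idRegex = new Regex(@"^([a-z]{5}-){4}[a-z]{5}$", RegexOptions.IgnoreCase);

// Hypothetical helper: true when the whole string is a well-formed ID.
bool IsValidId(string s) => idRegex.IsMatch(s);

Console.WriteLine(IsValidId("aBcDe-fghIj-KLmno-pQRsT-uVWxy"));   // True
Console.WriteLine(IsValidId("abcde-fghij-klmno-pqrst-uvwxy"));   // True
Console.WriteLine(IsValidId("abcde-fghij-klmno-pqrst-uvwx1"));   // False: digit in last block
Console.WriteLine(IsValidId("xxaBcDe-fghIj-KLmno-pQRsT-uVWxy")); // False: extra leading characters
```

Because the character class contains only letters, digits are rejected without any extra work. One caveat: in .NET `$` also matches just before a trailing newline, so `\z` is the stricter end anchor if that matters for your input.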
David Hedlund
Please check the edited post, I'm having trouble getting the regex working with just the a-z as opposed to a-zA-Z
fletcher
@fletcher: You need to *specify* that the regex should be case insensitive. How you do that depends on the language you're working in. In JavaScript it is the `i` flag, as in my example; in C# it is `new Regex(pattern, RegexOptions.IgnoreCase);`. There are also cases where you *cannot* set the option, such as the ASP.NET `RegularExpressionValidator` control, which doesn't expose RegexOptions, and possibly some programming languages that don't support it at all.
David Hedlund
Problem solved. Thanks David
fletcher
Alternatively you could use `^(i)([a-z]{5}-){4}[a-z]{5}$` where (i) turns on case insensitivity
Gary Green
@Gary: That should be `(?i)`, not `(i)`. Also, it's not supported in JavaScript, which may or may not be relevant, given that OP hasn't specified a flavor.
Alan Moore
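
For completeness, a short sketch of the inline `(?i)` form in C#, assuming the .NET regex flavor (the variable names are mine). It also shows why Test 1 from the question fails when nothing turns case insensitivity on:

```csharp
using System;
using System.Text.RegularExpressions;

string test1 = "aBcDe-fghIj-KLmno-pQRsT-uVWxy";

// Lower-case classes only, no option set: the upper-case letters do not match.
string caseSensitive = @"^([a-z]{5}-){4}[a-z]{5}$";

// Same pattern with the inline option at the start, so no RegexOptions argument is needed.
string inlineIgnoreCase = @"(?i)^([a-z]{5}-){4}[a-z]{5}$";

Console.WriteLine(Regex.IsMatch(test1, caseSensitive));    // False
Console.WriteLine(Regex.IsMatch(test1, inlineIgnoreCase)); // True
```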