Hi,

I'm working on a program where I need to match a string against a regular expression. The string is actually pretty simple, but I'm having problems with my current regex (I'm using the .NET regex engine).

My current regular expression is "^[VvEeTtPp][^a-zA-Z0-9\s]\d{0,12}?$"

Now, the string I want to match always follows this pattern:

  1. First a single letter (only letters allowed are V, E, P, T in either case)
  2. Then, a dash
  3. Finally from 4 to 12 digits.

One final restriction: the regex must also match any substring that complies with the rules (for example "V" or "E-" or "P-123").

The regex works fairly well, but it will accept things like "V--".

Could someone help me write a better expression?

Thanks

+2  A: 

Well, a substring of the 4-12 rule really just makes it a 1-12 rule, so how about:

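        // Needs: using System; using System.Text.RegularExpressions;
        Regex re = new Regex(@"^[VvEeTtPp](-|-[0-9]{1,12})?$");
        Console.WriteLine(re.IsMatch("B"));     // False: B is not an allowed letter
        Console.WriteLine(re.IsMatch("V"));     // True: letter only
        Console.WriteLine(re.IsMatch("E-"));    // True: letter and dash
        Console.WriteLine(re.IsMatch("P-123")); // True: letter, dash, 3 digits
        Console.WriteLine(re.IsMatch("V--"));   // False: second dash is not allowed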
Marc Gravell
Works great will all test cases. Do you mind explaining to me what this part of the regex (-|-[0-9]{1,12})? does?
Kiranu
meant to say works great _with_ not will
Kiranu
"(-|-[0-9]{1,12})?" is the same as "(-\d{0,12})?"; it matches a dash followed by up to 12 digits, zero or one times.
Guffa
(zero-or-one-of)(dash -or- dash followed by 1-12 digits)
Marc Gravell
@Guffa... yes, that would probably do it too. One of the problems with regex; you can write the same thing two-dozen ways.
Marc Gravell
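
To make the equivalence discussed in these comments concrete, here is a minimal sketch that runs both forms side by side. The class and member names (`DashDigitsDemo`, `Alternation`, `Folded`, `Agree`) are made up for illustration. The lower bound is written out as `{0,12}` because, as far as I know, the .NET engine only treats `{n}`, `{n,}` and `{n,m}` as quantifiers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DashDigitsDemo // hypothetical name, for illustration only
{
    // Alternation form from the answer: a lone dash, or a dash plus 1-12 digits.
    public static readonly Regex Alternation =
        new Regex(@"^[VvEeTtPp](-|-[0-9]{1,12})?$");

    // Folded form: a dash followed by 0-12 digits (lower bound written out).
    public static readonly Regex Folded =
        new Regex(@"^[VvEeTtPp](-[0-9]{0,12})?$");

    // Returns true when both patterns give the same verdict for the input.
    public static bool Agree(string input)
    {
        return Alternation.IsMatch(input) == Folded.IsMatch(input);
    }
}
```

Calling `DashDigitsDemo.Agree(...)` on the test strings from the answer ("B", "V", "E-", "P-123", "V--") is a quick way to check that neither form accepts something the other rejects.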
+3  A: 

This should do it:

^[EPTVeptv](-(\d{4,12})?)?$

Edit:
To also match substrings like "P-123", "-123" and "123":

^(?=.)[EPTVeptv]?(-\d{0,12})?$

Edit 2:
Added a positive lookahead in the beginning, so that the pattern doesn't match the substring "". Although that is a valid substring of a legal value, I assume that you don't want that specific substring...

Guffa
thanks for your quick answer, but the regex fails to match "p-" or "v-", etc.
Kiranu
Fails on P-123, which is stated as "should match"
Marc Gravell
@Gravell, as I stated in a comment to the question, I think that example is incorrect.
strager
Hm... Should it also match the substrings "-0" and "123"?
Guffa
I'd welcome the OP clarifying that, but it is a true substring of a matching pattern...
Marc Gravell
@Kiranu, I tested and that regexp does match "P-".
strager
@Gravell, Oh, you're right. Hmm...
strager
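
The comments above leave open whether "substring" means only leading parts of a value or any contiguous piece of one. If it really means any contiguous piece (the OP never confirmed this), one possible pattern is sketched below. The pattern and the names `SubstringDemo` and `IsSubstringOfValid` are my own assumptions, not something taken from the answers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstringDemo // hypothetical name, for illustration only
{
    // Either a lone letter, or a non-empty run made of an optional
    // "letter-dash" / "dash" head followed by 0-12 digits.
    // The (?=.) lookahead rejects the empty string.
    private static readonly Regex AnySubstring =
        new Regex(@"^(?:[EPTVeptv]|(?=.)(?:[EPTVeptv]?-)?[0-9]{0,12})$");

    // True if the input could be a contiguous piece of a value such as "P-1234".
    public static bool IsSubstringOfValid(string input)
    {
        return AnySubstring.IsMatch(input);
    }
}
```

With this sketch "V", "P-", "P-123", "-123", "-" and "123" are accepted, while "", "V--" and "V123" are rejected.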
A: 

[VvEePpTt]-\d{4,12}

Elieder
Doesn't match "V" or "E-", and it does match "kjsdfhlaksjdft-9938472283kljsad457l435k7j43fha3457sdf"...
Guffa
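
The behaviour described in this comment comes from the missing anchors: without `^` and `$`, `Regex.IsMatch` succeeds as soon as the pattern occurs anywhere in the input. A small sketch of the difference, with illustrative names (`AnchorDemo`, `MatchesAnywhere`, `MatchesWhole`) that are not from the thread:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo // hypothetical name, for illustration only
{
    // Unanchored: succeeds if the pattern occurs anywhere in the input.
    public const string Unanchored = @"[VvEePpTt]-\d{4,12}";

    // Anchored: the pattern has to account for the whole input.
    public const string Anchored = @"^[VvEePpTt]-\d{4,12}$";

    public static bool MatchesAnywhere(string input)
    {
        return Regex.IsMatch(input, Unanchored);
    }

    public static bool MatchesWhole(string input)
    {
        return Regex.IsMatch(input, Anchored);
    }
}
```

For the long string quoted in the comment, `MatchesAnywhere` finds "t-9938472283" in the middle and returns true, while `MatchesWhole` returns false. Neither version handles the partial inputs "V" or "E-".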
A: 

Can you try this and tell me if it works?

^[VvEeTtPp](-(\d{4,12}){0,1}){0,1}$

It accepts a single one of the specified letters, followed by either nothing or a single dash, which may in turn be followed by 4-12 digits. Without the trailing $, it finds a match at the start of each of these, for instance:

  • V
  • V-12
  • V12
  • V-12345
  • P-1234567890123

EDIT: Added a $ at the end so that the match fails if the string contains any extra characters.

Savvas Dalkitsis
The OP suggested (although I'm a little unclear) that "P-123" should match, which this doesn't. And you probably want a terminating $
Marc Gravell
it matches "V---------"
Kiranu
This just looks like a verbose version of Guffa's answer: '{0,1}' can be replaced with '?'
GApple
@Kiranu: it doesn't match V---------. It matches the first two characters, which is a substring, which you said you wanted... You should specify in more detail what you need.
Savvas Dalkitsis
@Kiranu: if you don't want to match them, simply use a $ at the end.
Savvas Dalkitsis
+1  A: 

I think this pattern fits the specification.

string pattern = @"^[VvEePpTt](?:$|-(?:$|\d{1,12}$))";
// these are matches
Console.WriteLine(Regex.IsMatch("V", pattern));
Console.WriteLine(Regex.IsMatch("v-", pattern));
Console.WriteLine(Regex.IsMatch("P-123", pattern));
Console.WriteLine(Regex.IsMatch("t-012345678901", pattern));
// these are not
Console.WriteLine(Regex.IsMatch("t--", pattern));
Console.WriteLine(Regex.IsMatch("E-0123456789012", pattern));

Pattern breakdown:

^             - start of string
[VvEePpTt]    - any of the given characters, exactly once
(?:           - start a non-capturing group...
$|-           - ...that matches either the end of the string or exactly one hyphen
(?:           - start a new non-capturing group...
$|\d{1,12}$   - ...that matches either the end of the string or 1 to 12 decimal digits followed by the end of the string
))            - end the groups
Fredrik Mörk
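
The question's rules describe two different checks: whether a partial input is still on track, and whether a value is complete (4 to 12 digits). A minimal sketch that keeps them as separate patterns follows. The names `CodeValidator`, `IsValidSoFar` and `IsComplete` are hypothetical, and the idea that the partial check is used while the value is being entered is my assumption; the OP did not say why substrings must match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CodeValidator // hypothetical name, for illustration only
{
    // Partial input: letter, optionally a dash, optionally up to 12 digits.
    private static readonly Regex Partial =
        new Regex(@"^[VvEePpTt](?:-[0-9]{0,12})?$");

    // Complete value per the question: letter, dash, 4 to 12 digits.
    private static readonly Regex Complete =
        new Regex(@"^[VvEePpTt]-[0-9]{4,12}$");

    // True while the input can still grow into a complete value.
    public static bool IsValidSoFar(string input)
    {
        return Partial.IsMatch(input);
    }

    // True only once the full letter-dash-digits rule is satisfied.
    public static bool IsComplete(string input)
    {
        return Complete.IsMatch(input);
    }
}
```

For example, "P-123" passes `IsValidSoFar` but not `IsComplete`, while "P-1234" passes both. `[0-9]` is used instead of `\d` here because `\d` in .NET also matches non-ASCII Unicode digits unless `RegexOptions.ECMAScript` is set.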