I have this regex that I built and tested in RegexBuddy:

"_ [ 0-9]{10}+ {1}+[ 0-9]{10}+ {2}+[ 0-9]{6}+ {2}[ 0-9]{2}"

When I use this in .NET (C#), I receive the exception:

"parsing \"_ [ 0-9]{10}+ +[ 0-9]{10}+ +[ 0-9]{6}+ [ 0-9]{2}\" - Nested quantifier +."

What does this error mean? Apparently .NET doesn't like the expression.

Here is the RegexBuddy explanation so you can understand my intention with the regex:

_ [ 0-9]{10}+ {1}+[ 0-9]{10}+ {2}+[ 0-9]{6}+ {2}[ 0-9]{2}

Match the characters "_ " literally «_ »
Match a single character present in the list below «[ 0-9]{10}+»
   Exactly 10 times «{10}+»
   The character " " « »
   A character in the range between "0" and "9" «0-9»
Match the character " " literally « {1}+»
   Exactly 1 times «{1}+»
Match a single character present in the list below «[ 0-9]{10}+»
   Exactly 10 times «{10}+»
   The character " " « »
   A character in the range between "0" and "9" «0-9»
Match the character " " literally « {2}+»
   Exactly 2 times «{2}+»
Match a single character present in the list below «[ 0-9]{6}+»
   Exactly 6 times «{6}+»
   The character " " « »
   A character in the range between "0" and "9" «0-9»
Match the character " " literally « {2}»
   Exactly 2 times «{2}»
Match a single character present in the list below «[ 0-9]{2}»
   Exactly 2 times «{2}»
   The character " " « »
   A character in the range between "0" and "9" «0-9»

In short...

What is a Nested quantifier?

+4  A: 

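.NET is complaining about the + after the {n}-style quantifier, as that combination doesn't make any sense to it. {n} means match exactly n of the preceding element. + means match one or more of the preceding element. A quantifier applied directly to another quantifier is what the parser calls a nested quantifier. Remove the +'s and it'll compile fine.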

"_ [ 0-9]{10} {1}[ 0-9]{10} {2}[ 0-9]{6} {2}[ 0-9]{2}"
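A minimal sketch of how one might check this in C#. The class name NestedQuantifierDemo and the helper Compiles are made up for illustration; only the Regex constructor and ArgumentException come from .NET. It assumes the parser behaves as described in the question, that is, it rejects the pattern with the + signs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestedQuantifierDemo
{
    // Hypothetical helper: true if .NET can parse the pattern,
    // false if the regex parser rejects it.
    public static bool Compiles(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Regex parse errors surface as ArgumentException (or a subclass of it).
            return false;
        }
    }

    public static void Main()
    {
        // Pattern from the question, with a + after each {n}.
        string withPlus = "_ [ 0-9]{10}+ {1}+[ 0-9]{10}+ {2}+[ 0-9]{6}+ {2}[ 0-9]{2}";
        // Same pattern with the + signs removed.
        string withoutPlus = "_ [ 0-9]{10} {1}[ 0-9]{10} {2}[ 0-9]{6} {2}[ 0-9]{2}";

        Console.WriteLine(Compiles(withPlus));
        Console.WriteLine(Compiles(withoutPlus));
    }
}
```

If the engine behaves as reported above, the first call should report a parse failure and the second should succeed.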

Duncan
+2  A: 

They're right. This version of your regex doesn't fail:

(_ [ 0-9]{10})+(\s{1})+([ 0-9]{10})+(\s{2})+([ 0-9]{6})+\s{2}[ 0-9]{2}

Notice the use of parens to create groups that then can repeat one or more times. Also, you should be more specific and use \s instead of a space, as pattern whitespace may or may not have significance.
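One thing worth checking before adopting the grouped form: a + on a group repeats the whole group, so a fixed-count piece wrapped this way can match more than one block. The sketch below uses a deliberately small two-digit pattern rather than the full regex. The class name GroupRepeatDemo and its methods are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupRepeatDemo
{
    // Exactly two digits, nothing else.
    public static bool MatchesFixed(string input)
    {
        return Regex.IsMatch(input, @"^[0-9]{2}$");
    }

    // One or more blocks of two digits: 2, 4, 6, ... digits in total.
    public static bool MatchesGrouped(string input)
    {
        return Regex.IsMatch(input, @"^([0-9]{2})+$");
    }

    public static void Main()
    {
        Console.WriteLine(MatchesFixed("12"));     // True
        Console.WriteLine(MatchesFixed("1234"));   // False
        Console.WriteLine(MatchesGrouped("12"));   // True
        Console.WriteLine(MatchesGrouped("1234")); // True
    }
}
```

So the grouped version compiles, but it accepts longer input than the exact-count version. Likewise, \s matches tabs and line breaks as well as spaces, which may or may not be what fixed-width data calls for.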

BTW, this regex doesn't look all that useful. You might want to ask another question along the lines of "How do I use regex to match this pattern?"

Will
Well, this is just a snippet. The full regex is this:
_ [0-9]{10} {1}[ 0-9]{10} {2}[ 0-9]{6} {2}[ 0-9]{2}|_ [ 0-9]{10} {1}[0-9]{10} {2}[ 0-9]{6} {2}[ 0-9]{2}|_ [ 0-9]{10} {1}[ 0-9]{10} {2}[ 0-9]{6} {2}[0-9]{2}
It's returning fields as long as one isn't blank. And I like the \s idea. Thanks
ctrlShiftBryan
I'm almost positive that regex could be shrunk down bigtime. Seriously, ask a question about how to do it and provide some sample data.
Will
+3  A: 

.NET doesn't support the possessive quantifier

{10}+

However, {10} should have exactly the same effect. The + avoids backtracking and trying shorter matches if the longest match fails, but since {10} can only match exactly 10 characters to start with this doesn't achieve much.

"_ [ 0-9]{10} [ 0-9]{10} {2}[ 0-9]{6} {2}[ 0-9]{2}"

should be fine. I've also dropped the "{1}+" bit. Since it matches exactly once, "A{1}+" is equivalent to just "A".

stevemegson
Cool. Didn't realise some regex engines provided this option.
Duncan
+1  A: 

If you select the .NET flavor in the toolbar at the top in RegexBuddy, RegexBuddy will indicate that .NET does not support possessive quantifiers such as {10}+.

Since {10} allows only for one specific number of repetitions, making it lazy or possessive is pointless, even if it is syntactically valid in the regex flavors that support lazy and/or possessive quantifiers. Removing the + signs from your regex will make it work fine with .NET.

In other situations, double-click on the error about the possessive quantifier in the Create tab in RegexBuddy. RegexBuddy will then replace the possessive quantifier with a functionally equivalent atomic group.
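As a rough illustration of that rewrite done by hand: in a flavor with possessive quantifiers one could write [0-9]++, and the atomic group (?>[0-9]+) is the form .NET accepts. The sketch below is not RegexBuddy output. The class name AtomicGroupDemo and its methods are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicGroupDemo
{
    // Greedy: [0-9]+ may give back its last digit so the trailing "3" can match.
    public static bool GreedyMatch(string input)
    {
        return Regex.IsMatch(input, @"^[0-9]+3$");
    }

    // Atomic: (?>[0-9]+) keeps every digit it took, so no backtracking into it.
    public static bool AtomicMatch(string input)
    {
        return Regex.IsMatch(input, @"^(?>[0-9]+)3$");
    }

    public static void Main()
    {
        Console.WriteLine(GreedyMatch("123")); // True
        Console.WriteLine(AtomicMatch("123")); // False

        // With a fixed count there is nothing to give back,
        // so the atomic and plain forms agree.
        Console.WriteLine(Regex.IsMatch("12", @"^(?>[0-9]{2})$")); // True
        Console.WriteLine(Regex.IsMatch("12", @"^[0-9]{2}$"));     // True
    }
}
```

The last two lines show why the atomic group is unnecessary for the regex in this question: every quantifier there has a fixed count.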

If you generate a source code snippet for a .NET language on the Use tab in RegexBuddy, RegexBuddy will automatically replace possessive quantifiers in the regex in the source code snippet.

Jan Goyvaerts