Is there a regex flavor that lets me count the number of repetitions matched by the * and + operators? I'd specifically like to know whether this is possible on the .NET platform.

A: 

How about a pattern like "pref ([a-z]+) suff"?

Use the group to capture the text matched by [a-z]+ inside the parentheses, then take its length.

You can use this length for subsequent matching as well.
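
A minimal sketch of this approach, assuming the repeated element is a single-character class so that the length of the captured text equals the number of repetitions. The class name, method name, and input string are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class GroupLengthDemo
{
    // Hypothetical helper: counts the letters matched by [a-z]+ between
    // the literal "pref " and " suff" markers, or returns 0 if no match.
    public static int CountLetters(string input)
    {
        var match = Regex.Match(input, @"pref ([a-z]+) suff");

        // Each repetition of [a-z] consumes exactly one character,
        // so the group's length equals the repetition count.
        return match.Success ? match.Groups[1].Length : 0;
    }

    static void Main()
    {
        Console.WriteLine(CountLetters("pref abcde suff")); // 5
    }
}
```

This only works because each repetition is one character wide; for a multi-character repeated pattern the length and the repetition count differ.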

Lie Ryan
This does not apply to repetition of a general pattern (see my answer for an example), but it works if the pattern matches exactly one character.
polygenelubricants
+3  A: 

You can use parentheses in the expression to create a group and then use the + or * operator on the group. The Captures property of the Group can be used to determine how many times it was matched. The following example counts the number of consecutive lower-case letters at the start of a string:

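using System;
using System.Text.RegularExpressions;

// The group ([a-z]) is repeated by +; each repetition is recorded as a capture.
var regex = new Regex(@"^([a-z])+");
var match = regex.Match("abc def");

if (match.Success)
{
    // Number of times the group matched: 3 for the leading "abc".
    Console.WriteLine(match.Groups[1].Captures.Count);
}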
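
Since the question also asks about *, here is a hedged sketch of the same idea with zero repetitions allowed. The class and method names are illustrative, not part of any API. With *, the overall match can succeed even when the group never matched, in which case the group has no captures.

```csharp
using System;
using System.Text.RegularExpressions;

class StarCountDemo
{
    // Hypothetical helper: number of lower-case letters at the start of input.
    public static int CountLeadingLetters(string input)
    {
        var match = Regex.Match(input, @"^([a-z])*");

        // The match itself succeeds even on an empty run of letters;
        // the group then reports zero captures.
        return match.Groups[1].Captures.Count;
    }

    static void Main()
    {
        Console.WriteLine(CountLeadingLetters("abc def")); // 3
        Console.WriteLine(CountLeadingLetters("123"));     // 0
    }
}
```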
Phil Ross
+1; I also added an example where the count differs from the length of the matched string.
polygenelubricants
+5  A: 

You're in luck: .NET regex does exactly this (which, as far as I know, is unusual among regex flavors). In every Match, each Group stores every Capture that it made.

So you can count how many times a repeatable pattern matched an input by:

  • Making it a capturing group
  • Counting how many captures were made by that group in each match
    • You can also iterate through the individual captures if you want!

Here's an example:

Regex r = new Regex(@"\b(hu?a)+\b");

var text = "hahahaha that's funny but not huahuahua more like huahahahuaha";
foreach (Match m in r.Matches(text)) {
   Console.WriteLine(m + " " + m.Groups[1].Captures.Count);
}

This prints (as seen on ideone.com):

hahahaha 4
huahuahua 3
huahahahuaha 5
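
Iterating through the individual captures could look like the sketch below, which reuses the same pattern on the last word of the text. The class and method names are made up for illustration; Capture.Index and Capture.Value give the position and text of each repetition.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class CaptureIterationDemo
{
    // Hypothetical helper: lists each repetition of (hu?a) as "index: text".
    public static List<string> Repetitions(string input)
    {
        var result = new List<string>();
        var match = Regex.Match(input, @"\b(hu?a)+\b");

        // Captures holds one entry per repetition of the group, in match order.
        foreach (Capture c in match.Groups[1].Captures)
        {
            result.Add(c.Index + ": " + c.Value);
        }
        return result;
    }

    static void Main()
    {
        foreach (var line in Repetitions("huahahahuaha"))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // 0: hua
        // 3: ha
        // 5: ha
        // 7: hua
        // 10: ha
    }
}
```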

polygenelubricants
See also http://stackoverflow.com/questions/2250335/differences-among-net-capture-group-match and http://stackoverflow.com/questions/3320823/whats-the-difference-between-groups-and-captures-in-net-regular-expressions
polygenelubricants