Is there a regex flavor that allows me to count the number of repetitions matched by the * and + operators? I'd specifically like to know if it's possible under the .NET Platform.
How about taking "pref ([a-z]+) suff", then using a group to capture the [a-z]+ in the parentheses and finding its length?
You can use this length for subsequent matching as well.
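A minimal sketch of this idea, assuming the repeated unit is a single character, so that the length of the captured text equals the number of repetitions. The pattern is the one from this answer; the input string, the RepetitionLength class, and the CountByLength helper are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepetitionLength
{
    // Returns the length of the text captured by group 1, or -1 if there is no match.
    // For a single-character unit such as [a-z], this equals the repetition count.
    public static int CountByLength(string pattern, string input)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Groups[1].Length : -1;
    }

    public static void Main()
    {
        // [a-z]+ matches "abcd", so the group length is 4.
        Console.WriteLine(CountByLength("pref ([a-z]+) suff", "pref abcd suff")); // 4
    }
}
```

Note that this only counts repetitions when the repeated unit is exactly one character wide. For a multi-character unit the length counts characters, not repetitions.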
You can use parentheses in the expression to create a group and then use the + or * operator on the group. The Captures property of the Group can be used to determine how many times it was matched. The following example counts the number of consecutive lower-case letters at the start of a string:
var regex = new Regex(@"^([a-z])+");
var match = regex.Match("abc def");
if (match.Success)
{
    Console.WriteLine(match.Groups[1].Captures.Count);
}
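Since the question also asks about *, the zero-repetition case may be worth checking. The following is a small sketch, assuming the same pattern as above with * in place of +; the StarCount class, the CountCaptures helper, and the inputs are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StarCount
{
    // Counts how many times group 1 was captured; returns -1 if the overall match fails.
    public static int CountCaptures(string pattern, string input)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Groups[1].Captures.Count : -1;
    }

    public static void Main()
    {
        // The group repeats three times before the space.
        Console.WriteLine(CountCaptures(@"^([a-z])*", "abc def")); // 3
        // The match succeeds with zero repetitions, so the group has no captures.
        Console.WriteLine(CountCaptures(@"^([a-z])*", "123"));     // 0
        // With +, at least one repetition is required, so the match fails.
        Console.WriteLine(CountCaptures(@"^([a-z])+", "123"));     // -1
    }
}
```

So with *, a successful match does not imply that the group captured anything, and a count of 0 is a valid result.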
You're fortunate, because .NET regex does in fact do this (which I think is quite unusual among regex flavors). Essentially, in every Match, each Group stores every Capture that it made.
So you can count how many times a repeatable pattern matched an input by:
- Making it a capturing group
- Counting how many captures were made by that group in each match
- You can also iterate through the individual captures if you want!
Here's an example:
Regex r = new Regex(@"\b(hu?a)+\b");
var text = "hahahaha that's funny but not huahuahua more like huahahahuaha";
foreach (Match m in r.Matches(text)) {
    Console.WriteLine(m + " " + m.Groups[1].Captures.Count);
}
This prints (as seen on ideone.com):
hahahaha 4
huahuahua 3
huahahahuaha 5
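Iterating through the individual captures, as mentioned in the list above, could look roughly like this. It is only a sketch: the CaptureWalk class and DescribeCaptures helper are hypothetical names, and the pattern and input are taken from the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureWalk
{
    // Returns each individual capture of group 1 in the first match as "value@index".
    public static List<string> DescribeCaptures(string pattern, string input)
    {
        var result = new List<string>();
        Match match = Regex.Match(input, pattern);
        if (match.Success)
        {
            // Captures are stored in the order in which they were matched.
            foreach (Capture c in match.Groups[1].Captures)
            {
                result.Add(c.Value + "@" + c.Index);
            }
        }
        return result;
    }

    public static void Main()
    {
        var captures = DescribeCaptures(@"\b(hu?a)+\b", "huahahahuaha");
        Console.WriteLine(string.Join(" ", captures));
        // hua@0 ha@3 ha@5 hua@7 ha@10
    }
}
```

Each Capture exposes Value, Index and Length, so you can see both what each repetition matched and where it matched.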