I'm trying to understand how (?<name>pattern) works in Regex. Is there a good link or can someone offer a simple explanation?
From Mastering Regular Expressions:
Named capture:
\b(?<Area>\d\d\d)-(?<Exch>\d\d\d)-(?<Num>\d\d\d\d)\b
This "fills the names" Area, Exch, and Num with the components of a US phone number. The program can then refer to each matched substring through its name, for example, RegexObj.Groups("Area") in VB.NET and most other .NET languages, RegexObj.Groups["Area"] in C#, RegexObj.group("Area") in Python, and $matches["Area"] in PHP. The result is clearer code.
Within the regular expression itself, the captured text is available via \k<Area> with .NET, and (?P=Area) in Python and PHP.
With Python and .NET (but not with PHP), you can use the same name more than once within the same expression.
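To see the named groups in action, here is a minimal Python sketch using the same three group names. Note that Python's re module spells a named group (?P<name>...) rather than (?<name>...), and the sample phone number and variable names are made up for illustration.

```python
import re

# Python writes named groups as (?P<name>...); the group names match the quoted pattern.
PHONE = re.compile(r"\b(?P<Area>\d\d\d)-(?P<Exch>\d\d\d)-(?P<Num>\d\d\d\d)\b")

# Hypothetical input, chosen only to demonstrate the lookup by name.
m = PHONE.search("Call 555-867-5309 after noon.")

area = m.group("Area")   # look up one group by name
parts = m.groupdict()    # all named groups as a dict
print(area, parts)
```

Looking groups up by name keeps working if you later add another pair of parentheses to the pattern, which is where numbered groups tend to break.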
This is normally used for back reference problems. See: http://www.regular-expressions.info/named.html
This feature lets you refer to what you have captured by a friendly name instead of a numeric index, either from your code or from within the regex itself. A simple example:
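using System;
using System.Text.RegularExpressions;

// Capture "foo" in any letter case as group "foo", then require the identical text again.
Regex regex = new Regex(@"(?<foo>[fF][oO][oO]) \k<foo>");
foreach (Match match in regex.Matches("bar fOO foO foO f0O"))
{
    Console.WriteLine(match);
}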
Prints
foO foO
This regex matches a "foo" in any combination of upper- and lower-case letters, but only if it is immediately followed by a space and another "foo" with exactly the same casing. You can also refer to the group from code using the match.Groups["name"] syntax, so in this example match.Groups["foo"].Value will return "foO".
Edit: New example that will use \k<name> syntax.
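For comparison, here is roughly the same check sketched in Python, where the named backreference is written (?P=foo) instead of \k<foo>. This assumes only the standard re module; the variable names are just for illustration.

```python
import re

# (?P<foo>...) captures the first "foo"; (?P=foo) requires the exact same text again.
pattern = re.compile(r"(?P<foo>[fF][oO][oO]) (?P=foo)")
text = "bar fOO foO foO f0O"

# Collect each full match together with the text captured by the named group.
matches = [(m.group(0), m.group("foo")) for m in pattern.finditer(text)]
print(matches)
```

"fOO foO" is skipped because the casing differs between the two words, and "f0O" contains a zero, so only the middle pair matches.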