tags:

views:

2202

answers:

3

I'm having a hard time finding a good resource that explains how to use Named Capturing Groups in C#. This is the code that I have so far:

string page = Encoding.ASCII.GetString(bytePage);
Regex qariRegex = new Regex("<td><a href=\"(?<link>.*?)\">(?<name>.*?)</a></td>");
MatchCollection mc = qariRegex.Matches(page);
CaptureCollection cc = mc[0].Captures;
MessageBox.Show(cc[0].ToString());

However this always just shows the full line:

<td><a href="/path/to/file">Name of File</a></td>

I have experimented with several other "methods" that I've found on various websites but I keep getting the same result.

How can I access the named capturing groups that are specified in my regex?

+7  A: 

Use the group collection of the Match object, indexing it with the capturing group name, e.g.

foreach (Match m in mc){
    MessageBox.Show(m.Groups["link"]);
}
Paolo Tedesco
+1  A: 

You specify the named capture group string by passing it to the Groups property of a resulting Match object.

Here is a small example:

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
     String sample = "hello-world-";
     Regex regex = new Regex("-(?<test>[^-]*)-");

     Match match = regex.Match(sample);

     if (match.Success)
     {
      Console.WriteLine(match.Groups["test"].Value);
     }
    }
}
Andrew Hare
+1  A: 

The following code sample, will match the pattern even in case of space characters in between. i.e. :

<td><a href='/path/to/file'>Name of File</a></td>

as well as:

<td> <a      href='/path/to/file' >Name of File</a>  </td>

Method returns true or false, depending on whether the input htmlTd string matches the pattern or no. If it matches, the out params contain the link and name respectively.

/// <summary>
/// Assigns proper values to link and name, if the htmlId matches the pattern
/// </summary>
/// <returns>true if success, false otherwise</returns>
public static bool TryGetHrefDetails(string htmlTd, out string link, out string name)
{
    link = null;
    name = null;

    string pattern = "<td>\\s*<a\\s*href\\s*=\\s*(?:\"(?<link>[^\"]*)\"|(?<link>\\S+))\\s*>(?<name>.*)\\s*</a>\\s*</td>";

    if (Regex.IsMatch(htmlTd, pattern))
    {
        Regex r = new Regex(pattern,  RegexOptions.IgnoreCase | RegexOptions.Compiled);
        link = r.Match(htmlTd).Result("${link}");
        name = r.Match(htmlTd).Result("${name}");
        return true;
    }
    else
        return false;
}

I have tested this and it works correctly.

Rashmi Pandit