I have a sample string:

<num>1.</num> <Ref>véase anomalía de Ebstein</Ref> <num>2.</num> <Ref>-> vascularización</Ref>

I want to build a comma-separated string from the values inside the Ref tags.

I have tried the following:

            Regex r = new Regex("<ref>(?<match>.*?)</ref>");
            Match m = r.Match(csv[4].ToLower());
            if (m.Groups.Count > 0)
            {
                if (m.Groups["match"].Captures.Count > 0)
                {
                    foreach (Capture c in m.Groups["match"].Captures)
                    {
                        child.InnerText += c.Value + ", ";       
                    }
                    child.InnerText = child.InnerText.Substring(0, child.InnerText.Length - 2).Replace("-> ", "");
                }
            }

But this only ever seems to find the value inside the first ref tag.

Where am I going wrong?

+3  A: 

Use Matches rather than Match: Match returns only the first match, whereas Matches returns every match in the input. Something like:

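// Needs: using System.Collections.Generic; using System.Text.RegularExpressions;
// IgnoreCase lets the pattern match <Ref> without lower-casing the input first.
Regex r = new Regex("<ref>(?<match>.*?)</ref>", RegexOptions.IgnoreCase);
List<string> values = new List<string>();
foreach (Match m in r.Matches(csv[4]))
{
    // Each Match is one <Ref>...</Ref> pair; drop the "-> " prefix from its value.
    values.Add(m.Groups["match"].Value.Replace("-> ", ""));
}
// Join once after the loop, so there is no trailing ", " to trim off.
child.InnerText += string.Join(", ", values.ToArray());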
Wolfwyrd
A: 

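Regex quantifiers are greedy by default, so a plain .* would match from the first tag to the last tag (the .*? in your pattern is already lazy, so it stops at the first closing tag). If your XML is well formed, you can also change the regex to something like: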

Regex r = new Regex("<ref>(?<match>[^<]*?)</ref>");

This matches any run of characters other than <, so a match can never run past its own closing tag.
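
As a rough, self-contained sketch of that pattern combined with Matches (the RefExtractor class and ExtractRefs method are names made up for illustration, and RegexOptions.IgnoreCase is an addition here so the input does not have to be lower-cased first):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RefExtractor
{
    // Hypothetical helper: returns the text of every <Ref>...</Ref> pair, comma separated.
    public static string ExtractRefs(string input)
    {
        // [^<]* cannot cross a '<', so each match ends at its own closing tag.
        Regex r = new Regex("<ref>(?<match>[^<]*)</ref>", RegexOptions.IgnoreCase);
        List<string> values = new List<string>();
        foreach (Match m in r.Matches(input))
        {
            values.Add(m.Groups["match"].Value);
        }
        return string.Join(", ", values.ToArray());
    }

    public static void Main()
    {
        string sample = "<num>1.</num> <Ref>véase anomalía de Ebstein</Ref> "
                      + "<num>2.</num> <Ref>-> vascularización</Ref>";
        Console.WriteLine(ExtractRefs(sample));
    }
}
```

Note that this only works while the text inside a Ref tag never contains a < itself; if it can, the lazy .*? version is the safer choice.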

ck
It's not *really* xml as such unfortunately.
qui
+1  A: 

I strongly recommend using XPath over regular expressions to search XML documents.

string xml = @"<test>
    <num>1.</num> <Ref>véase anomalía de Ebstein</Ref> <num>2.</num> <Ref>-> vascularización</Ref>
</test>";

XmlDocument d = new XmlDocument();
d.LoadXml(xml);

var list = from XmlNode n in d.SelectNodes("//Ref") select n.InnerText;
Console.WriteLine(String.Join(", ", list.ToArray()));
Robert Rossney
...and I strongly agree with your assertion.
Cerebrus