I need to extract the string between two tags, for example "0002" from "morenonxmldata<tag1>0002</tag1>morenonxmldata".

I'm using C# and .NET 3.5.

+5  A: 
  // Greedy (.*) matches up to the last </tag1> in the input.
  Regex regex = new Regex("<tag1>(.*)</tag1>");
  Match match = regex.Match("morenonxmldata<tag1>0002</tag1>morenonxmldata");
  string s = match.Groups[1].Value; // "0002"

Or (as mentioned in the comments), to match as little as possible between the tags, use a lazy quantifier:

  Regex regex = new Regex("<tag1>(.*?)</tag1>");
Aaron
Thanks for a fast reply.
Ashley
This is dangerous! On this string: "aa<tag1>bbb</tag1>ccc<tag1>ddd</tag1>eee" it will return "bbb</tag1>ccc<tag1>ddd".
Kugel
@Aaron: use a non-greedy match by changing `(.*)` to `(.*?)` - this will prevent an incorrect match as mentioned by @Kugel.
Ahmad Mageed
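
To make the greedy versus lazy difference from the comments concrete, here is a minimal sketch that runs both patterns against Kugel's example string. The class name `GreedyVsLazy` and the helper `Extract` are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical helper: returns the first capture group of the first match.
    public static string Extract(string input, string pattern)
    {
        return Regex.Match(input, pattern).Groups[1].Value;
    }

    public static void Main()
    {
        // Kugel's example input, which contains two <tag1> elements.
        string input = "aa<tag1>bbb</tag1>ccc<tag1>ddd</tag1>eee";

        // Greedy: (.*) extends to the last </tag1> in the input.
        Console.WriteLine(Extract(input, "<tag1>(.*)</tag1>"));  // bbb</tag1>ccc<tag1>ddd

        // Lazy: (.*?) stops at the first </tag1> it can reach.
        Console.WriteLine(Extract(input, "<tag1>(.*?)</tag1>")); // bbb
    }
}
```

With only one pair of tags in the input, as in the question, both patterns return the same value; the difference only shows up once the tag repeats.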
+3  A: 

A RegEx-free solution:

string ExtractString(string s, string tag) {
     // You should check for errors in real-world code, omitted for brevity
     var startTag = "<" + tag + ">";
     int startIndex = s.IndexOf(startTag) + startTag.Length;
     int endIndex = s.IndexOf("</" + tag + ">", startIndex);
     return s.Substring(startIndex, endIndex - startIndex);
}
Mehrdad Afshari
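
The snippet above leaves out error checks. If the opening tag is missing, `IndexOf` returns -1 and the start index is silently wrong; if the closing tag is missing, `Substring` throws an `ArgumentOutOfRangeException`. Below is one possible way to add those checks, as a sketch only. The names `TagExtractor` and `ExtractStringOrNull`, and the choice to return null rather than throw, are assumptions made for illustration.

```csharp
using System;

public static class TagExtractor
{
    // Hypothetical variant of ExtractString: returns null when the input is
    // null or when either the opening or the closing tag cannot be found.
    public static string ExtractStringOrNull(string s, string tag)
    {
        if (s == null || tag == null) return null;

        string startTag = "<" + tag + ">";
        string endTag = "</" + tag + ">";

        // Ordinal comparison avoids culture-sensitive matching of the tag text.
        int open = s.IndexOf(startTag, StringComparison.Ordinal);
        if (open < 0) return null;

        int startIndex = open + startTag.Length;
        int endIndex = s.IndexOf(endTag, startIndex, StringComparison.Ordinal);
        if (endIndex < 0) return null;

        return s.Substring(startIndex, endIndex - startIndex);
    }

    public static void Main()
    {
        string input = "morenonxmldata<tag1>0002</tag1>morenonxmldata";
        Console.WriteLine(ExtractStringOrNull(input, "tag1"));         // 0002
        Console.WriteLine(ExtractStringOrNull(input, "tag2") == null); // True
    }
}
```

Like the original, this returns the content of the first occurrence of the tag only.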
+1  A: 

A Regex approach using lazy match and back-reference:

foreach (Match match in Regex.Matches(
        "morenonxmldata<tag1>0002</tag1>morenonxmldata<tag2>abc</tag2>asd",
        @"<([^>]+)>(.*?)</\1>"))
{
    Console.WriteLine("{0}={1}",
        match.Groups[1].Value,
        match.Groups[2].Value);
}
Marc Gravell
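
If the pairs are needed as data rather than console output, the same pattern can be wrapped in a small helper. This is a sketch under assumptions: the names `TagPairs` and `Find` are invented here, and `RegexOptions.Singleline` is an addition not in the original answer, so that `.` also matches newlines inside a value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagPairs
{
    // Hypothetical helper: lists every <name>value</name> pair in order of appearance.
    public static List<KeyValuePair<string, string>> Find(string input)
    {
        var result = new List<KeyValuePair<string, string>>();

        // \1 requires the closing tag name to equal the captured opening tag name.
        // Singleline (an assumption, see above) lets '.' match newline characters.
        foreach (Match m in Regex.Matches(input, @"<([^>]+)>(.*?)</\1>", RegexOptions.Singleline))
        {
            result.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        }
        return result;
    }

    public static void Main()
    {
        string input = "morenonxmldata<tag1>0002</tag1>morenonxmldata<tag2>abc</tag2>asd";
        foreach (var pair in Find(input))
        {
            Console.WriteLine("{0}={1}", pair.Key, pair.Value);
        }
        // Prints:
        // tag1=0002
        // tag2=abc
    }
}
```

A list is used instead of a dictionary because the same tag name may appear more than once in the input.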