How can I specify to only match the first occurrence of a regular expression in C# using the Regex method?

Here's an example:

        string text = @"<link href=""/_layouts/OracleBI/OracleBridge.ashx?RedirectURL=res/sk_oracle10/b_mozilla_4/common.css"" type=""text/css"" rel=""stylesheet""></link></link>";
        string pattern = @"(<link).+(link>)";
        Regex myRegex = new Regex(pattern, RegexOptions.IgnoreCase);

        Match m = myRegex.Match(text);   // m is the first match
        while (m.Success)
        {
            // Do something with m
            Console.Write(m.Value + "\n");
            m = m.NextMatch();              // more matches
        }
        Console.Read();

I would like each match to stop at the first </link>, and then do the same for the rest of the matches.

Thanks.

+7  A: 

Regex.Match(myString) returns the first match it finds.

Subsequent calls to NextMatch() on the resultant object from Match() will continue to match the next occurrences, if any.

For example:

  string text = "my string to match";
  string pattern = @"(\w+)\s+";
  Regex myRegex = new Regex(pattern, RegexOptions.IgnoreCase);

  Match m = myRegex.Match(text);   // m is the first match
  while (m.Success)
  {
       // Do something with m

       m = m.NextMatch();          // advance to the next match
  }


EDIT: If you're parsing HTML, I would seriously consider using the HTML Agility Pack. You will save yourself many, many headaches.

womp
This solution (with the addition of m = m.NextMatch();) still doesn't stop at the first match. It seems to run through to the last occurrence.
Josh
Here's an example: string text = @"<link href=""/_layouts/OracleBI/OracleBridge.ashx?RedirectURL=res/sk_oracle10/b_mozilla_4/common.css"" type=""text/css"" rel=""stylesheet""></link></link>"; string pattern = @"(<link).+(link>)";
Josh
A: 

Use grouping combined with RegexOptions.ExplicitCapture.
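
The answer does not spell out a pattern, so the following is only one possible reading of it: a minimal sketch in which a named group (the name "body" is made up here) is combined with RegexOptions.ExplicitCapture. The input string is a shortened, hypothetical stand-in for the one in the question. Note that ExplicitCapture only changes which groups capture; the .+? quantifier is still what keeps the match short.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical, shortened input (not the original string from the question).
string text = @"<link href=""a.css""></link>";

// With ExplicitCapture, plain (...) groups do not capture;
// only named groups such as (?<body>...) do.
Regex rx = new Regex(@"(<link)(?<body>.+?)(link>)",
                     RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);

Match m = rx.Match(text);
string body = m.Groups["body"].Value;   // text between "<link" and "link>"
int groupCount = m.Groups.Count;        // group 0 (whole match) plus "body"

Console.WriteLine(body);
Console.WriteLine(groupCount);
```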

Tim Mahy
+2  A: 

I believe you just need to add a lazy quantifier to the pattern in the first example. Whenever a wildcard is "eating too much", you need either a lazy quantifier on the wildcard or, in a more complicated scenario, a lookahead. Use .+? in place of .+ in the pattern at the top, and you should be good.
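
As a rough illustration of the difference, here is a small sketch that runs the greedy and the lazy pattern over the same text. The input is a shortened, hypothetical line containing two link elements, not the original string from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical input: two separate link elements on one line.
string text = @"<link href=""a.css""></link><link href=""b.css""></link>";

// Greedy: .+ runs to the end of the line, then backtracks to the LAST "link>".
Regex greedy = new Regex(@"(<link).+(link>)", RegexOptions.IgnoreCase);

// Lazy: .+? grows one character at a time and stops at the FIRST "link>".
Regex lazy = new Regex(@"(<link).+?(link>)", RegexOptions.IgnoreCase);

string greedyFirst = greedy.Match(text).Value;   // spans both elements

List<string> lazyMatches = new List<string>();
for (Match m = lazy.Match(text); m.Success; m = m.NextMatch())
{
    lazyMatches.Add(m.Value);                    // one entry per element
}

Console.WriteLine("greedy: " + greedyFirst);
foreach (string s in lazyMatches)
{
    Console.WriteLine("lazy:   " + s);
}
```

With the lazy pattern, each call to NextMatch() picks up the next element instead of everything being swallowed by a single match.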

Rich
A: 

Maybe a little over-simplified, but if you get a collection of matches back and want to get the first occurrence you could look at the Match.Index property to find the lowest index.

Here's the MSDN documentation on it.
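
A minimal sketch of that idea, using a made-up input string and pattern: collect all matches with Regex.Matches and keep the one with the smallest Index. In practice the collection is already ordered left to right (unless RegexOptions.RightToLeft is used), so the first element would normally give the same result.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "one two three";   // hypothetical input

MatchCollection all = Regex.Matches(text, @"\w+");

// Scan the collection for the match with the lowest starting index.
// Match.Empty is a failed match, used here as the "nothing found yet" marker.
Match first = Match.Empty;
foreach (Match m in all)
{
    if (!first.Success || m.Index < first.Index)
    {
        first = m;
    }
}

Console.WriteLine(first.Value + " at index " + first.Index);   // one at index 0
```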

If it is just a matter of how much text each match covers, then I agree with Rich's answer: you need non-greedy (lazy) quantifiers to stop your expression from 'eating' too much.

AJ