Using regex is not the correct way to do this. As others have pointed out, use an HTML parser. If you have HTML Agility Pack, you can do this:

    using System;
    using System.Linq;
    using System.Text.RegularExpressions;
    using HtmlAgilityPack;

    class Program
    {
        static void Main(string[] args)
        {
            string html = @"<html><body><td class=""blah"" ...........>Some text blah: page 13 of 99<br> more stuff</td></body></html>";

            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(html);

            // SelectNodes returns null (not an empty collection) when nothing matches.
            var nodes = doc.DocumentNode.SelectNodes("//td[@class='blah']");
            if (nodes != null)
            {
                var td = nodes.FirstOrDefault();
                if (td != null)
                {
                    // InnerText drops the markup, so the regex only has to handle plain text.
                    Match match = Regex.Match(td.InnerText, @"page \d+ of (\d+)");
                    if (match.Success)
                    {
                        // Group 1 is the total page count.
                        Console.WriteLine(match.Groups[1].Value);
                    }
                }
            }
        }
    }

Output:

    99
However, it can be done with regex, as long as you accept that it's not going to be a perfect solution. It's fragile, and can easily be tricked, but here it is:

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            string s = @"stuff <td class=""blah"" ...........>Some text blah: page 13 of 99<br> more stuff";

            // Matches the opening <td> with class="blah", then text up to "page N of M<br>".
            Match match = Regex.Match(s, @"<td[^>]*\sclass=""blah""[^>]*>[^<]*page \d+ of (\d+)<br>");
            if (match.Success)
            {
                // Group 1 is the total page count.
                Console.WriteLine(match.Groups[1].Value);
            }
        }
    }

Output:

    99
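To make "fragile" concrete, here is a small sketch that runs the same pattern over a few variations of the markup. The `FragilityDemo` class, the `ExtractTotalPages` helper, and the sample inputs are made up for illustration; they are not from the question. Only the first input, which has the original shape, should print `99`. The others should print `(no match)`, even though a browser would render all of them much the same way.

```csharp
using System;
using System.Text.RegularExpressions;

class FragilityDemo
{
    // Same pattern as the regex example above.
    const string Pattern = @"<td[^>]*\sclass=""blah""[^>]*>[^<]*page \d+ of (\d+)<br>";

    // Hypothetical helper: returns the total page count, or null when the pattern does not match.
    public static string ExtractTotalPages(string html)
    {
        Match match = Regex.Match(html, Pattern);
        return match.Success ? match.Groups[1].Value : null;
    }

    static void Main()
    {
        // Illustrative inputs (assumptions, not taken from the question).
        string[] inputs =
        {
            @"<td class=""blah"">page 13 of 99<br>",             // original shape
            @"<td class='blah'>page 13 of 99<br>",               // single-quoted attribute
            @"<td class=""blah"">page 13 of 99<br/>",            // self-closing <br/>
            @"<td class=""blah""><b>Note</b> page 13 of 99<br>", // nested tag before the text
            @"<td class=""blah big"">page 13 of 99<br>",         // second class name
        };

        foreach (string input in inputs)
        {
            Console.WriteLine("{0} -> {1}", input, ExtractTotalPages(input) ?? "(no match)");
        }
    }
}
```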
Just make sure no-one ever sees you do this.