I have the following VB.Net 2.0 code in an ASP.Net app:
output = Regex.Replace(output, "<p>(?:(?:\<\!\-\-.*?\-\-\>)|&(?:nbsp|\#0*160|x0*A0);|<br\s*/?>|[\s\u00A0]+)*</p>", String.Empty, RegexOptions.Compiled Or RegexOptions.CultureInvariant Or RegexOptions.IgnoreCase Or RegexOptions.Singleline)
Examples it matches correctly:
<p></p>
<p> </p>
<p><br/><br/></p>
<p><!-- comment --><!-- comment --></p>
<p>&nbsp;</p>
<p><br/>&nbsp;</p>
<p><!-- comment --><br/><!-- comment --></p>
<p>&nbsp;<br/></p>
An example I'd like it to match, but it doesn't:
<p > <!--[if !supportLineBreakNewLine]--><br /> <!--[endif]--></p>
How do I change the groups and repetitions so that this example also matches?
Edit: oops, forgot the comment group. Edit #2: oops, forgot a fail. Edit #3: fixed examples. Edit #4: updated regex based on answers
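For anyone reproducing this, a small console harness makes it easy to see which samples a pattern accepts. The sketch below is only an illustration: the module name, the sample list, and the `^(?:...)$` wrapper (used to require that the whole sample is matched) are assumptions, not part of the original code. The pattern is the one from the question with `<p>` relaxed to `<p\s*>`, since the failing sample has a space inside its opening tag.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EmptyParagraphCheck
    Sub Main()
        ' Pattern from the question, with <p> changed to <p\s*> so that
        ' whitespace before the ">" of the opening tag is allowed.
        Dim pattern As String = _
            "<p\s*>(?:(?:<!--.*?-->)|&(?:nbsp|\#0*160|x0*A0);|<br\s*/?>|[\s\u00A0]+)*</p>"
        Dim options As RegexOptions = RegexOptions.CultureInvariant Or _
            RegexOptions.IgnoreCase Or RegexOptions.Singleline

        ' Anchor the pattern so a sample only counts if it is matched in full.
        Dim re As New Regex("^(?:" & pattern & ")$", options)

        ' Illustrative samples; the last one is the failing case from the question.
        Dim samples() As String = New String() { _
            "<p></p>", _
            "<p> </p>", _
            "<p><br/><br/></p>", _
            "<p><!-- comment --><!-- comment --></p>", _
            "<p>&nbsp;</p>", _
            "<p > <!--[if !supportLineBreakNewLine]--><br /> <!--[endif]--></p>" _
        }

        ' Print True/False next to each sample.
        For Each sample As String In samples
            Console.WriteLine("{0,-6} {1}", re.IsMatch(sample), sample)
        Next
    End Sub
End Module
```

Swapping `<p\s*>` back to `<p>` in the pattern string is a quick way to confirm that the opening tag is what makes the last sample fail.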
Conclusion:
Here are my benchmark results for all three answers. Since all three now match everything, I ran each one through 10,000 iterations on a block of text:
Mine:
<p\s*>(?:(?:<!--.*?-->)|&(?:nbsp|\#0*160|x0*A0);|<br\s*/?>|[\s\u00A0]+)*</p>
6.312
Gumbo:
<p\s*>(?:[\s\u00A0]+|&(?:nbsp|\#0*160|x0*A0);|<br\s*/?>|<!--(?:[^-]+|-(?!-))*-->)*</p>
6.05
steamer25:
<p\s*>(?:(?:\&nbsp\;)|(?:\&\#0*160\;)|(?:<br\s*/?>)|\s|\u00A0|<!\-\-[^(?:\-\-)]*\-\->)*</p>
6.121
Gumbo's was the fastest, so I'll mark his as the correct answer.
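The timing code itself is not shown above, so the following is only a sketch of how a comparison like this might be run, not the harness that produced those numbers. The module name, the `TimePattern` helper, and the sample input are hypothetical. It assumes `System.Diagnostics.Stopwatch` for timing and builds each `Regex` once, outside the loop, so that compilation cost is not counted.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module EmptyParagraphBenchmark
    ' Hypothetical helper: times 10,000 Replace calls and returns elapsed seconds.
    Function TimePattern(ByVal pattern As String, ByVal html As String) As Double
        Dim options As RegexOptions = RegexOptions.Compiled Or _
            RegexOptions.CultureInvariant Or RegexOptions.IgnoreCase Or _
            RegexOptions.Singleline

        ' Construct once so regex compilation is not part of the measurement.
        Dim re As New Regex(pattern, options)
        re.Replace(html, String.Empty) ' warm-up call, not timed

        Dim watch As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To 10000
            re.Replace(html, String.Empty)
        Next
        watch.Stop()
        Return watch.Elapsed.TotalSeconds
    End Function

    Sub Main()
        ' Made-up input; the block of text used for the figures above is not shown.
        Dim html As String = _
            "<p>text</p><p > <!--[endif]--><br /> </p><p><br/></p><p>more text</p>"

        Dim names() As String = New String() {"Mine", "Gumbo"}
        Dim patterns() As String = New String() { _
            "<p\s*>(?:(?:<!--.*?-->)|&(?:nbsp|\#0*160|x0*A0);|<br\s*/?>|[\s\u00A0]+)*</p>", _
            "<p\s*>(?:[\s\u00A0]+|&(?:nbsp|\#0*160|x0*A0);|<br\s*/?>|<!--(?:[^-]+|-(?!-))*-->)*</p>" _
        }

        For i As Integer = 0 To patterns.Length - 1
            Console.WriteLine("{0}: {1:F3} s", names(i), TimePattern(patterns(i), html))
        Next
    End Sub
End Module
```

Further patterns can be added to the two arrays in the same way. With differences this small between patterns, repeating the run a few times and on realistic input is probably worthwhile before drawing conclusions.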