ansaurus

Question

.net Regular Expression involving html tags

Answer 1

+2 A:

Replace every match of the regex "<[^>]*>" with the empty string, trim the result, and check whether anything is left afterwards.
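A minimal sketch of that approach, assuming the input is an ordinary .NET string. The class and method names (TagStripper, HasVisibleText) are hypothetical and only there for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

static class TagStripper
{
    // Hypothetical helper: true if anything other than whitespace
    // remains once everything that looks like a tag is removed.
    public static bool HasVisibleText(string html)
    {
        if (html == null) return false;

        // "<[^>]*>" matches a '<', any run of characters that are not '>',
        // and the closing '>'. Each match is replaced with nothing.
        string stripped = Regex.Replace(html, "<[^>]*>", "");

        // Trim drops leading and trailing whitespace, so a string made up
        // only of tags and whitespace ends up empty.
        return stripped.Trim().Length > 0;
    }
}
```

With this sketch, HasVisibleText("<p> </p>") is false and HasVisibleText("<p>Hi</p>") is true. Character entities such as &nbsp; are not tags, so they are left in place and count as text.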

Tomalak 2009-03-04 17:23:18

Thanks for the quick response. I used this method and it worked.

2009-03-04 19:08:59

Answer 2

A:

I once used this to strip out html tags:

const string tagsPatterns = "\\s*<.*?>\\s*"; 
value = System.Text.RegularExpressions.Regex.Replace(value, tagsPatterns, " ");

You can adapt it to get the string with no tags and then check that the result isn't empty. This version was meant to preserve spacing: it replaces each tag, together with any whitespace around it, with a single space.

Update 1: Here it goes :)

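using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class HasTextTests
{
    // Returns true if anything other than whitespace remains after removing tags.
    static bool HasText(string value)
    {
        // Lazy match from '<' to the next '>'. Note that '.' does not match
        // a newline unless RegexOptions.Singleline is passed.
        const string tagsPattern = "<.*?>";
        value = Regex.Replace(value, tagsPattern, "");
        return value.Trim() != "";
    }

    [TestMethod]
    public void TestMethod2()
    {
        Assert.IsFalse(HasText("<html></html>"));
        Assert.IsTrue(HasText("<html>Hi</html>"));
        Assert.IsTrue(HasText("<a href='google.com'>Click Me</a>"));
        Assert.IsTrue(HasText("hello"));
        Assert.IsFalse(HasText("<bold><italics></bold></italics>"));
        Assert.IsFalse(HasText(""));
    }
}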

eglasius 2009-03-04 17:24:51

Answer 3

A:

Here's an article written by Phil Haack about using a regular expression to match HTML.

Also, if you want a simple line of code, consider loading the string into an XmlDocument. It parses the string, so you'll know whether it is well-formed XML or not.
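A rough sketch of the XmlDocument idea, assuming the markup is expected to be well-formed XML (real-world HTML frequently is not, for example an unclosed <br> or an undeclared entity such as &nbsp;). The names XmlCheck and TryHasText are hypothetical:

```csharp
using System;
using System.Xml;

static class XmlCheck
{
    // Hypothetical helper: returns false if 'markup' is not well-formed XML.
    // Otherwise returns true and reports via 'hasText' whether the document
    // contains any non-whitespace text.
    public static bool TryHasText(string markup, out bool hasText)
    {
        hasText = false;
        if (markup == null) return false;

        var doc = new XmlDocument();
        try
        {
            // LoadXml throws XmlException when the string is not well-formed.
            doc.LoadXml(markup);
        }
        catch (XmlException)
        {
            return false;
        }

        // InnerText concatenates the text of the root element and its descendants.
        hasText = doc.DocumentElement.InnerText.Trim().Length > 0;
        return true;
    }
}
```

Unlike the regex approaches, this rejects mis-nested input such as "<bold><italics></bold></italics>" outright instead of treating it as empty.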

ajma 2009-03-04 17:24:54

I believe you misunderstood the question a bit.

Tomalak 2009-03-04 17:28:05
