My SSRS DataSet returns a field that contains HTML, e.g.

<b>blah blah </b><i> blah </i>.

How do I strip all the HTML tags? It has to be done with inline VB.NET.

Changing the data in the table is not an option.

Thank you.

Solution found:

= System.Text.RegularExpressions.Regex.Replace(StringWithHTMLtoStrip, "<[^>]+>","")

+2  A: 

Here's a good example using Regular Expressions: http://www.4guysfromrolla.com/webtech/042501-1.shtml

Daniel Jennings
A: 

If you know the HTML is well-formed, you could load the data in that field into a System.Xml.XmlDocument and then read the InnerText value from it. This only works if the text has a single root node.

You can add the root node yourself if need be, since the wrapper element does not affect the resulting text. The HTML itself still has to be well-formed.
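
A minimal sketch of that idea, written as an SSRS custom code function (Report Properties > Code). The function name StripHtmlViaXml and the wrapper element name root are made up for illustration, and the report may need a reference to the System.Xml assembly before this compiles. It reads InnerText from the document element rather than from the document itself.

```vbnet
' Hypothetical helper: wraps the fragment in a root element, parses it as XML,
' and returns the concatenated text of all nodes.
Public Function StripHtmlViaXml(ByVal html As String) As String
    If String.IsNullOrEmpty(html) Then
        Return String.Empty
    End If

    Dim doc As New System.Xml.XmlDocument()
    ' LoadXml throws an XmlException if the markup is not well-formed XML.
    doc.LoadXml("<root>" & html & "</root>")
    Return doc.DocumentElement.InnerText
End Function
```

Note that LoadXml will throw on typical "loose" HTML such as an unclosed <br> or an entity like &nbsp;, so this approach is only safe when the stored markup is known to be XML-clean.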

Jason Bunting
+2  A: 

Thanks to Daniel, but I needed it to be done inline. Here's the solution:

= System.Text.RegularExpressions.Regex.Replace(StringWithHTMLtoStrip, "<[^>]+>","")

Here are the links:

http://weblogs.asp.net/rosherove/archive/2003/05/13/6963.aspx
http://msdn.microsoft.com/en-us/library/ms157328.aspx
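
If the same expression is needed in several text boxes, one option is to wrap it in a custom code function (Report Properties > Code) and call it from the expression. This is only a sketch: the function name StripHtml and the field name Description are assumptions, not part of the original report.

```vbnet
' Hypothetical helper: removes anything that looks like <...> from the input.
Public Function StripHtml(ByVal input As String) As String
    ' Guard against NULL field values, which reach custom code as Nothing.
    If String.IsNullOrEmpty(input) Then
        Return String.Empty
    End If

    Return System.Text.RegularExpressions.Regex.Replace(input, "<[^>]+>", "")
End Function
```

It would then be called from a text box expression as = Code.StripHtml(Fields!Description.Value). The pattern removes tags only; entities such as &amp; are left in the text as they are.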

roman m
A: 

If you don't want to use regular expressions (for example, if you need better performance), you could try a small method I wrote a while ago, posted at CodeProject.
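
The CodeProject method itself is not reproduced here. As a rough illustration of what a regex-free approach can look like, the sketch below scans the string once and copies only the characters that fall outside angle brackets. The function name StripTagsNoRegex is made up, and this is not necessarily how the linked method works.

```vbnet
' Hypothetical helper: single pass over the input, skipping everything
' between a "<" and the next ">".
Public Function StripTagsNoRegex(ByVal input As String) As String
    If String.IsNullOrEmpty(input) Then Return String.Empty

    Dim sb As New System.Text.StringBuilder(input.Length)
    Dim insideTag As Boolean = False

    For Each c As Char In input
        If insideTag Then
            ' Still inside a tag: drop characters until it closes.
            If c = ">"c Then insideTag = False
        ElseIf c = "<"c Then
            insideTag = True
        Else
            sb.Append(c)
        End If
    Next

    Return sb.ToString()
End Function
```

One difference from the regular expression: a "<" that is never closed causes the rest of the string to be dropped here, whereas the regex would leave that text in place.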

Andrei Rinea