Hi,

Can you help me with a code snippet (with or without regex) that removes all span tags from a string like this one? (Silverlight, C#)

<a href="#">
  <span class="uiTooltipWrap bottom left leftbottom">
    <span class="uiTooltipText">
      dasd dssa<br />
      adsa sssss
    </span>
  </span>
</a>

Thanks.

A: 

In Perl we might say:

s{
  <     # tag opening character
  /?    # optional slash
  span
  [^>]* # any non tag-closing characters
  >     # tag closing character
}{}x;   # replace with nothing (/x applies to the pattern only, so the replacement must be truly empty)

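and I'm sure you can translate this into a C# regular expression: replace anything that matches </?span[^>]*> with the empty string. This removes only the opening and closing span tags; the text and the other markup inside them stay in place.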
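
One possible C# translation is sketched below. The class and method names (SpanStripper, RemoveSpanTags) are made up for illustration, and the \b after "span" is a small addition to the pattern above so that other tags whose names merely start with "span" are left alone. It assumes no attribute value contains a literal ">" character.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class SpanStripper
{
    // "<", optional "/", "span", a word boundary (an addition to the
    // pattern in the answer), any characters other than ">", then ">".
    private const string SpanTagPattern = @"</?span\b[^>]*>";

    // Removes opening and closing span tags; whatever they enclose is kept.
    public static string RemoveSpanTags(string html)
    {
        return Regex.Replace(html, SpanTagPattern, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

For example, SpanStripper.RemoveSpanTags("<a href=\"#\"><span class=\"x\">text<br /></span></a>") returns <a href="#">text<br /></a>. The whitespace and line breaks that surrounded the removed tags are not touched, so the multi-line sample in the question keeps its indentation.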

PP
+2  A: 

HTMLAgilityPack is for you.

It is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (you don't actually have to understand XPath or XSLT to use it, don't worry). It is a .NET code library that lets you parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to what System.Xml offers, but for HTML documents (or streams).
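
A rough sketch of how the span removal could look with this library follows. The class and method names (SpanUnwrapper, RemoveSpans) are hypothetical. It uses the LINQ-style Descendants call rather than XPath, and it assumes an HtmlAgilityPack build is available for your Silverlight target, which is worth checking first. How the kept children are ordered has reportedly varied between releases, so test it against the version you install.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper; the names are illustrative only.
public static class SpanUnwrapper
{
    public static string RemoveSpans(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Snapshot the matches first, because the tree is modified in the loop.
        var spans = doc.DocumentNode.Descendants("span").ToList();
        foreach (var span in spans)
        {
            // Second argument is keepGrandChildren: the span element is
            // dropped and its children are re-attached to the span's parent.
            span.ParentNode.RemoveChild(span, true);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Compared with the regex approach, this works on the parsed tree, so odd attribute values or malformed markup are handled by the parser rather than by the pattern.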

Ruel