I need to HTML-encode a whole text while leaving the < and > of the tags intact.

Example:

<p>Give me 100.000 €!</p>

must become:

<p>Give me 100.000 &euro;!</p>

The HTML tags must remain intact.

+1  A: 

Maybe use string.Replace for just the characters you want to encode?

Stefan
This has to be the best approach. It looks like the OP is trying to encode non-ASCII characters as entities (probably to get around character set issues). For that, a regex matching code points above 127 and replacing them with known entity names would be best. Or better yet, sort out the charset issue if that's the underlying problem. :-)
T.J. Crowder
Or the other way around: first encode everything, then convert the &lt; and &gt; back to < and >.
MichaelD
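
To make the regex idea from the comment above concrete, here is a minimal sketch. It assumes only characters above code point 127 need encoding. The class name NonAsciiEncoder and the three-entry lookup table are hypothetical and only for illustration; characters without a known name fall back to a numeric character reference.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NonAsciiEncoder
{
    // Hypothetical, deliberately tiny lookup table; extend it as needed.
    private static readonly Dictionary<string, string> NamedEntities =
        new Dictionary<string, string>
        {
            { "\u20AC", "&euro;" },  // euro sign
            { "\u00A3", "&pound;" }, // pound sign
            { "\u00A9", "&copy;" },  // copyright sign
        };

    // A surrogate pair is matched as one unit; otherwise any single
    // character outside the ASCII range is matched.
    private static readonly Regex NonAscii =
        new Regex(@"[\uD800-\uDBFF][\uDC00-\uDFFF]|[^\u0000-\u007F]");

    // Tags and all other ASCII text pass through unchanged.
    public static string Encode(string html)
    {
        return NonAscii.Replace(html, m =>
        {
            string name;
            if (NamedEntities.TryGetValue(m.Value, out name))
                return name;

            // No known name: emit a numeric character reference instead.
            int codePoint = m.Value.Length == 2
                ? char.ConvertToUtf32(m.Value, 0)
                : (int)m.Value[0];
            return "&#" + codePoint + ";";
        });
    }
}
```

With this sketch, NonAsciiEncoder.Encode("<p>Give me 100.000 €!</p>") returns "<p>Give me 100.000 &euro;!</p>". Note that it does not touch &, < or > at all, so it only addresses the character set side of the problem.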
+1  A: 

You might go for the Html Agility Pack and then encode the text content of the tags.

Andreas Niedermair
+1 - I think this solves most of the issues with trying to do it with replace or regex, and it's probably less work than creating your own whitelist of tags to ignore or characters to replace.
RedFilter
A: 

You could use HtmlTextWriter in addition to HtmlEncode: use HtmlTextWriter to set up your <p></p>, then set the body of the <p></p> using HtmlEncode. HtmlTextWriter offers ToString() and a bunch of other methods, so it shouldn't be much more code.

RandomBen
A: 

Use a regular expression that matches either a tag or what's between tags, and encode what's between:

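// Requires: using System.Text.RegularExpressions; using System.Web;
html = Regex.Replace(
  html,
  "(<[^>]+>|[^<]+)",
  // Leave tags untouched; HTML-encode the text between them.
  m => m.Value.StartsWith("<") ? m.Value : HttpUtility.HtmlEncode(m.Value)
);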
Guffa
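
For anyone who wants to try the regex approach outside a web project, here is a self-contained sketch of the same idea. It swaps in WebUtility.HtmlEncode from System.Net for HttpUtility.HtmlEncode, so no System.Web reference is needed; the class name TagPreservingEncoder is hypothetical.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class TagPreservingEncoder
{
    // Matches either a whole tag or a run of text between tags.
    private static readonly Regex TagOrText = new Regex("<[^>]+>|[^<]+");

    public static string Encode(string html)
    {
        return TagOrText.Replace(html, m =>
            m.Value.StartsWith("<", StringComparison.Ordinal)
                ? m.Value                          // tag: keep as-is
                : WebUtility.HtmlEncode(m.Value)); // text: encode
    }
}
```

For example, TagPreservingEncoder.Encode("<p>Fish & chips for 5 €</p>") returns "<p>Fish &amp; chips for 5 €</p>": the & is encoded and the tags survive, but the € is left alone, because this encoder does not emit named entities such as &euro;. The pattern is also easy to trip up: a literal < in the text or a > inside an attribute value is treated as a tag boundary, and text that already contains entities gets encoded a second time.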