I have a string that might look like this:

$str = '<p>Me & Mrs Jones <br /> live in <strong style="color:#FFF;">España</strong></p>';
htmlentities($str, ENT_COMPAT, 'UTF-8', false);

How can I convert the text to HTML entities without converting the HTML tags?

Note: I need to keep the HTML intact.

A: 

If you mean to convert only text, then try this:

$orig = '<p>Me & Mrs Jones <br /> live in <strong style="color:#FFF;">España</strong></p>';
$str = strip_tags($orig);

$str = htmlentities($str,ENT_COMPAT,'UTF-8',false);
Sarfraz
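To make the effect of this approach concrete, here is a minimal sketch of what the two calls return for the sample string. It assumes the source file is saved as UTF-8; the variable names `$text` and `$encoded` are only illustrative. Note that the tags are removed, so this only fits if the markup itself is not needed afterwards.

```php
<?php
// Sample string from the question (single quotes, so the inner double quotes need no escaping).
$orig = '<p>Me & Mrs Jones <br /> live in <strong style="color:#FFF;">España</strong></p>';

// strip_tags() drops every tag and keeps only the text nodes.
$text = strip_tags($orig);
// $text is now: Me & Mrs Jones  live in España   (two spaces where <br /> was)

// htmlentities() then encodes the remaining special characters.
$encoded = htmlentities($text, ENT_COMPAT, 'UTF-8', false);
// $encoded is now: Me &amp; Mrs Jones  live in Espa&ntilde;a

echo $encoded, "\n";
```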
A: 

I haven't used htmlentities before, but it seems like a slightly more robust version of urlencode (which I use a lot). You might want to try:

htmlentities(strip_tags($str), ENT_COMPAT, 'UTF-8', false);

Just as a little nugget, if you want to preserve <br> tags as ordinary newlines, you could do this:

htmlentities(strip_tags(str_replace(array("<br>", "<br />"), "\n", $str)), ENT_COMPAT, 'UTF-8', false);

I know that's something I sometimes like to do.

Good Luck.

Carlson Technology
Matt Ellen
+1  A: 

Disclaimer: I would not encode any entities, except for <, > and &. That said, if you really want this, do this:

$str = '...';
$str = htmlentities($str,ENT_NOQUOTES,'UTF-8',false);
$str = str_replace(array('&lt;','&gt;'),array('<','>'), $str);
Evert
I would go with this too; most of the time there is no need to encode " and '. And stuff like €, á, é should be handled by Unicode already.
Ivo Wetzel
Except this will fail when he has "2 > 5" in his markup
TravisO
Evert
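As a rough illustration of this answer and of the failure case raised in the comments, the sketch below wraps the two steps in a helper. The function name `encode_keep_tags` is hypothetical, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: encode everything, then turn the tag delimiters back.
function encode_keep_tags(string $html): string
{
    // ENT_NOQUOTES leaves " and ' alone, so attribute values survive.
    $encoded = htmlentities($html, ENT_NOQUOTES, 'UTF-8', false);
    return str_replace(array('&lt;', '&gt;'), array('<', '>'), $encoded);
}

$ok = encode_keep_tags('<p>Me & Mrs Jones <br /> live in <strong style="color:#FFF;">España</strong></p>');
// <p>Me &amp; Mrs Jones <br /> live in <strong style="color:#FFF;">Espa&ntilde;a</strong></p>

// An already escaped comparison in the text is turned back into a raw "<".
$bad = encode_keep_tags('<p>1 &lt; 2</p>');
// <p>1 < 2</p>

echo $ok, "\n", $bad, "\n";
```

The second call shows why text containing comparisons is a problem: the escaped `&lt;` cannot be told apart from a tag delimiter once everything has been replaced.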
A: 

The problem you face is that, in some cases, your text already contains encoded '<' and '>' (that is, &lt; and &gt;), so you have to filter them out after the conversion.

This is similar to Evert's answer, but adds one more step to allow for content like 1 < 2 in your markup:

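// double_encode must be true here: only then does an existing &lt; become &amp;lt; and survive the next step
$str = htmlentities($str,ENT_NOQUOTES,'UTF-8',true);
$str = str_replace(array('&lt;','&gt;'),array('<','>'), $str);
$str = str_replace(array('&amp;lt;','&amp;gt;'),array('&lt;','&gt;'), $str);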
Boldewyn
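A minimal sketch of the three-step idea follows, with a worked input. It relies on `double_encode` being `true`, since the third step only has something to restore when an existing `&lt;` has first been turned into `&amp;lt;`. The function name `encode_keep_tags_and_escapes` is hypothetical, and the file is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: keep tags intact and keep already escaped < and > escaped.
function encode_keep_tags_and_escapes(string $html): string
{
    // Step 1: encode everything; "<" becomes "&lt;", an existing "&lt;" becomes "&amp;lt;".
    $encoded = htmlentities($html, ENT_NOQUOTES, 'UTF-8', true);
    // Step 2: restore the tag delimiters.
    $encoded = str_replace(array('&lt;', '&gt;'), array('<', '>'), $encoded);
    // Step 3: undo the double encoding of the pre-escaped comparison signs.
    return str_replace(array('&amp;lt;', '&amp;gt;'), array('&lt;', '&gt;'), $encoded);
}

$result = encode_keep_tags_and_escapes('<p>1 &lt; 2 in España</p>');
// <p>1 &lt; 2 in Espa&ntilde;a</p>

echo $result, "\n";
```

One caveat of this variant: any other entity already present in the input (for example `&amp;`) is double encoded by step 1 and is not restored, so it only suits input whose sole pre-escaped characters are `<` and `>`.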