This isn't as simple as you might think, because neither htmlspecialchars() nor htmlentities() provides an option to ignore certain tags (neither function has any notion of tags in the first place).
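To see the problem, here is a quick check with a made-up input string (the sample text is my own): the harmless <b> tag is escaped just like the <script> tag.

```php
<?php
// Both functions escape every tag alike; there is no allow-list parameter.
$input = '<b>bold</b> and <script>alert(1)</script>';

echo htmlspecialchars($input), "\n";
echo htmlentities($input), "\n";
// Both lines print:
// &lt;b&gt;bold&lt;/b&gt; and &lt;script&gt;alert(1)&lt;/script&gt;
```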
You could use some other means to allow the users to format their posts, e.g. BBCode, Textile or Markdown. There are PHP parsers available for all of them.
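To give an idea of why such a markup is easier to handle, here is a minimal sketch for a tiny BBCode-like subset. The function name render_bbcode_sketch and the two supported tags are assumptions of mine; this is not one of the existing parsers, which cover far more. Everything is escaped first, and only then are the known tags turned into HTML.

```php
<?php
// Hypothetical helper: escape everything first, then turn a tiny
// BBCode subset into HTML. A real BBCode parser handles far more.
function render_bbcode_sketch(string $input): string
{
    // Escape all HTML so user-supplied markup never reaches the browser.
    $html = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

    // [b]...[/b] becomes <b>...</b>
    $html = preg_replace('~\[b\](.*?)\[/b\]~is', '<b>$1</b>', $html);

    // [url=http(s)://...]text[/url] becomes a link; other schemes stay as text.
    $html = preg_replace(
        '~\[url=(https?://[^\]\s]+)\](.*?)\[/url\]~is',
        '<a href="$1">$2</a>',
        $html
    );

    return $html;
}

echo render_bbcode_sketch('[b]Hi[/b] <i>x</i> [url=http://example.com/]link[/url]'), "\n";
// <b>Hi</b> &lt;i&gt;x&lt;/i&gt; <a href="http://example.com/">link</a>
```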
If you have to stick with HTML tags, you could resort to some preprocessing that rewrites the allowed tags into a form that htmlspecialchars() does not touch. You can then postprocess the result to change the placeholders back into normal HTML tags. The following sample illustrates this process for a simple <a> tag. Please be aware that processing HTML with regular expressions is error-prone and not always the way to go; I'll use it just for the sake of simplicity in this example.
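// Preprocess: wrap <a ...> and </a> in markers that htmlspecialchars() leaves alone.
$input = preg_replace('~<(/?a\b[^<>]*)>~i', '|#$1#|', $input);
$input = htmlspecialchars($input);
// Postprocess: turn the markers back into tags and undo the escaping inside them (e.g. &quot; in attributes).
$input = preg_replace_callback('~\|#(/?a\b.*?)#\|~i', function ($m) { return '<' . htmlspecialchars_decode($m[1]) . '>'; }, $input);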
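This is untested and will surely require a lot more work before it is safe: the attributes of a preserved tag are passed through unchecked (including event handlers such as onclick), and a user can type the |#...#| marker directly.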
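As a rough idea of what "more work" could look like, here is a sketch of the same preprocess/escape/postprocess idea with two changes: the placeholder contains a random token so it cannot be typed by a user, and the tag is rebuilt from scratch so only an http(s) href survives. The function name escape_except_links, the placeholder format and the "http(s) only" rule are my own assumptions, and the sketch does not check that opening and closing tags are balanced.

```php
<?php
// Hypothetical helper: keeps plain <a href="http(s)://..."> links and
// escapes everything else. The allow-list is an assumption of this sketch.
function escape_except_links(string $input): string
{
    // Random token so users cannot forge a placeholder in their post.
    $nonce = bin2hex(random_bytes(8));
    $saved = [];

    // 1. Preprocess: swap each allowed tag for a numbered placeholder.
    $input = preg_replace_callback(
        '~<a\s+href="(https?://[^"<>\s]+)"\s*>|</a>~i',
        function (array $m) use (&$saved, $nonce): string {
            $key = "[[{$nonce}:" . count($saved) . "]]";
            // Rebuild the tag so no other attribute (onclick, style) survives.
            $saved[$key] = isset($m[1])
                ? '<a href="' . htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8', false) . '">'
                : '</a>';
            return $key;
        },
        $input
    );

    // 2. Escape whatever is left, including tags that were not allowed.
    $escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

    // 3. Postprocess: put the rebuilt tags back in place of the placeholders.
    return $saved ? strtr($escaped, $saved) : $escaped;
}

echo escape_except_links('<a href="http://example.com/">x</a> <script>alert(1)</script>'), "\n";
// <a href="http://example.com/">x</a> &lt;script&gt;alert(1)&lt;/script&gt;
```

A link with extra attributes or a javascript: URL does not match the pattern, so it is simply escaped like any other text. For anything beyond a toy case, a dedicated HTML sanitizing library is the safer route.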