I'm using the PHP Markdown library (http://michelf.com/projects/php-markdown/) and the JavaScript Markdown library Showdown (http://attacklab.net/showdown/).

I want to disallow all HTML, but both Markdown implementations seem to allow it indiscriminately. My first attempt was simply to escape all HTML special characters before feeding the text into Markdown. However, this also escapes the <hyperlink> and <email> autolink syntax, which is very useful.

I'd like to escape all HTML (not remove) but preserve all markdown syntax.

+3  A: 

You have two options.

First, you can actually care about the HTML the user submits and do something about it. Try lib_filter by Cal Henderson (of Flickr fame) or maybe something more heavyweight like HTMLPurifier.

Second, if you really only want to neutralize all HTML but keep special syntax, use htmlspecialchars and then undo the conversions for the exact strings you're looking for with regexes. That might be a teeny bit more hairy. ;)

Yes, these are both PHP implementations, not JavaScript. An Ajax call to a special-purpose preview-generating script when the user stops typing for a moment should be speedy enough. Or make them hit a button.
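If you'd rather keep the preview on the client, the second option can be roughed out in JavaScript and run before Showdown. This is only a sketch under assumptions: the function name escapeHtmlKeepAutolinks is made up for illustration, the escaping is a rough stand-in for htmlspecialchars that covers only &, < and >, and both regexes are deliberately simplified (http/https URLs only, a loose email pattern), not full URL or address validators.

```javascript
// Hypothetical helper: escape all HTML, then restore only Markdown autolinks.
function escapeHtmlKeepAutolinks(text) {
  // Step 1: neutralize HTML. "&" must be escaped first so the entities
  // produced for "<" and ">" are not escaped a second time.
  var escaped = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");

  // Step 2: undo the escaping for <http://...> and <https://...> only.
  // Whitespace and quotes are excluded so nothing attribute-like can ride along,
  // and other schemes such as javascript: stay escaped.
  escaped = escaped.replace(/&lt;(https?:\/\/[^\s"']+?)&gt;/g, "<$1>");

  // Step 3: undo the escaping for <user@example.com> (simplified pattern).
  escaped = escaped.replace(/&lt;([\w.+-]+@[\w-]+(?:\.[\w-]+)+)&gt;/g, "<$1>");

  return escaped;
}

var input = "<b>bold</b> <http://example.com/> <me@example.com>";
console.log(escapeHtmlKeepAutolinks(input));
// prints: &lt;b&gt;bold&lt;/b&gt; <http://example.com/> <me@example.com>
```

The result is what you would hand to the Markdown converter. Two caveats with this approach: a leading > for blockquotes gets escaped too and would need the same kind of restore step, and HTML typed inside code spans or code blocks ends up escaped twice, because Markdown escapes it again on its own.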

Charles
I considered the regex option, but matching URLs and email addresses with a regex is kind of icky. I will check out HTMLPurifier, thanks.
peterjwest