Is there a PHP implementation of markdown suitable for using in public comments?
Basically it should only allow a subset of the markdown syntax (bold, italic, links, block-quotes, code-blocks and lists), and strip out all inline HTML (or possibly escape it?)
I guess one option is to use the normal markdown parser, and run the output through an HTML sanitiser, but is there a better way of doing this..?
We're using PHP markdown Extra for the rest of the site, so we'd already have to use a secondary parser (the non-"Extra" version, since things like footnote support is unnecessary).. It also seems nicer parsing only the *bold*
text and having everything escaped to <a href="etc">
, than generating <b>bold</b>
text and trying to strip the bits we don't want..
Also, on a related note, we're using the WMD control for the "main" site, but for comments, what other options are there? WMD's javascript preview is nice, but it would need the same "neutering" as the PHP markdown processor (it can't display images and so on, otherwise someone will submit and their working markdown will "break")
Currently my plan is to use the PHP-markdown -> HTML santiser method, and edit WMD to remove the image/heading syntax from showdown.js
- but it seems like this has been done countless times before..
Basically:
- Is there a "safe" markdown implementation in PHP?
- Is there a HTML/javascript markdown editor which could have the same options easily disabled?
Update: I ended up simply running the markdown()
output through HTML Purifier.
This way the Markdown rendering was separate from output sanitisation, which is much simpler (two mostly-unmodified code bases) more secure (you're not trying to do both rendering and sanitisation at once), and more flexible (you can have multiple sanitisation levels, say a more lax configuration for trusted content, and a much more stringent version for public comments)