I'm looking for a simple PHP library that helps filter XSS vulnerabilities out of PHP Markdown output. For example, PHP Markdown will happily parse things such as:

[XSS Vulnerability](javascript:alert('xss'))

I've been doing some reading around and the best I've found on the subject here was this question.

Although HTML Purifier looks like the best (and nearly the only) solution, I was wondering if there was anything more general out there. HTML Purifier seems to be a bit robust for my needs, as well as a pain to configure, though it looks like it would work very well once configured.

Is there anything else out there that may be a little less robust and configurable but still do a solid job? Or should I just dig in and start trying to configure HTML Purifier for my needs?

EDIT FOR CLARITY: I'm not looking to cut corners or anything of the like. HTML Purifier offers a lot of fine-grained control, and for a small, simple project that much control isn't needed, though using nothing isn't an option either. This is where I was coming from when asking for something simpler or less robust.

Also, a final note: I'm NOT looking for suggestions to use htmlspecialchars(), strip_tags() or anything of the like. I already disallow embedded HTML in PHP Markdown Extra by sanitizing it in a similar fashion. I'm looking for ways to prevent XSS vulnerabilities in PHP Markdown OUTPUT.

Thanks.

+1  A: 

I've never heard of any tool other than HTML Purifier for doing that, and HTML Purifier does indeed have a good reputation.

Maybe it's "a bit robust" and "a pain to configure", yes; but it's also probably the most widely used and tested solution available in PHP, and those are important criteria when you have to choose such an important component.

Even if it means investing half a day to configure it properly, if I were in your situation, I would probably choose HTML Purifier.
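
For what it's worth, the configuration for a Markdown-only use case can stay fairly small. The following is only a sketch, assuming HTML Purifier is installed through Composer (package `ezyang/htmlpurifier`); the element list and the `$html` value are illustrative stand-ins, not a recommended policy:

```php
<?php
// Assumption: HTML Purifier was installed with Composer, so the autoloader lives here.
require_once __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();

// Allow only a small set of elements and attributes that Markdown typically emits.
// This list is illustrative; adjust it to what your pages actually need.
$config->set(
    'HTML.Allowed',
    'p,br,em,strong,a[href|title],ul,ol,li,blockquote,code,pre,h1,h2,h3,img[src|alt|title]'
);

// Restrict URL schemes to an explicit allowlist (scheme => true lookup array).
$config->set('URI.AllowedSchemes', ['http' => true, 'https' => true, 'mailto' => true]);

$purifier = new HTMLPurifier($config);

// Stand-in for the HTML that PHP Markdown produced from the user's text.
$html = '<p><a href="javascript:alert(\'xss\')">XSS Vulnerability</a></p>';

// purify() returns the cleaned HTML as a string.
echo $purifier->purify($html);
```

Everything not set here falls back to HTML Purifier's defaults, so the fine-grained options can simply be left alone on a small project.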

Pascal MARTIN
Yeah, I suppose my comments came off as if I didn't care much about it, which isn't true. I was just hoping there'd be something a bit more general out there. HTML Purifier gives you a lot of fine-grained control over every aspect of itself that I don't really need for my current project, though using nothing or something extremely basic isn't an option either. You know?
anomareh
A: 

There is no such thing as too robust. “Sanitising” HTML is hard. Any corners you cut to process it more simply are likely to result in exploits sneaking through. Even complicated old HTMLPurifier, with its best-of-breed reputation, has had multiple ways of sneaking dangerous markup through in the past!

However, if your text-markup solution is capable of outputting dangerous HTML then it is deficient and should be replaced IMO. If PHP Markdown allows javascript: URLs through then that's a pretty lamentable, basic flaw and I don't think I'd trust it to get anything else right.
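
To illustrate the kind of check a renderer should be making on link targets, here is a rough sketch of a scheme allowlist. The function name `isSafeUrl` and the default scheme list are made up for this example, and it assumes the URL has already been entity-decoded (for instance, read from an HTML parser rather than from raw markup). It is not a replacement for a full sanitiser:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true when $url is relative or uses an allowed scheme.
 * Assumes $url is the already-decoded attribute value.
 */
function isSafeUrl(string $url, array $allowedSchemes = ['http', 'https', 'mailto']): bool
{
    // Browsers ignore whitespace and control characters in and around the
    // scheme, so strip them before looking at it.
    $normalized = preg_replace('/[\x00-\x20\x7F]+/', '', $url);
    if ($normalized === null) {
        return false; // regex failure: refuse rather than guess
    }

    // No "scheme:" prefix means a relative URL, which is allowed here.
    if (!preg_match('/^([a-z][a-z0-9+.\-]*):/i', $normalized, $m)) {
        return true;
    }

    // Allowlist, not blocklist: anything not explicitly permitted is rejected.
    return in_array(strtolower($m[1]), $allowedSchemes, true);
}

var_dump(isSafeUrl("javascript:alert('xss')"));
var_dump(isSafeUrl('https://example.com/'));
```

The point of the allowlist is that `data:`, `vbscript:` and whatever else comes along are rejected without having to be named one by one.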

bobince
My issue with HTML Purifier was that it gives you an immense amount of fine-grained control over every aspect of itself. I'm not looking to cut corners, just for something a bit simpler to configure, since the small project I'm working on doesn't need that fine-grained control. Using nothing isn't an option either. As far as PHP Markdown goes, AFAIK it's the only PHP implementation of Markdown, and its purpose is to parse Markdown into valid HTML. That link is valid HTML, although it can be harmful. PHP Markdown doesn't advertise itself as secure or claim security as a goal.
anomareh