ansaurus

Question

Alternative to HTML Purifier

Answer 1

+3 A:

You can try PHP Tidy, which is the Tidy library in PHP.

Vivin Paliath 2010-10-28 22:31:36

I imagine it should. According to the installation page, this module comes bundled with PHP 5 and later.

Vivin Paliath 2010-10-28 23:11:45

Answer 2

A:

I believe Tidy will help close your tags, but it isn't as comprehensive as HTML Purifier, which can also remove valid but unwanted tags or attributes (e.g. JavaScript onclick handlers, that kind of thing).

Be aware that Tidy requires libtidy to be installed on your server, so it's not just straight PHP.

I know Pádraic Brady has been working on an alternative to HTML Purifier for Zend Framework, though I think it's just experimental code at this time:

http://framework.zend.com/wiki/pages/viewpage.action?pageId=25002168

http://github.com/padraic/wibble

simonrjones 2010-10-28 22:41:53

I tried it, but it has a lot of bugs.

Vivek Goel 2010-10-28 23:25:25

Shame. I'd recommend either getting HTML Purifier working or trying Tidy.

simonrjones 2010-10-30 20:34:01

Answer 3

+1 A:

Simple solution without third-party libraries: create a DOMDocument and call loadHTML on it with your input. Surround the input with <html> and <body> tags if you are only parsing a small snippet. You'll probably want to suppress warnings too, as loadHTML emits them for common bad HTML.

Then simply walk over the resulting document tree, removing any elements and attributes that are not on a known-good list. You should also check the URL attributes you allow to ensure they use known-good schemes like http:, and not potentially troublesome schemes like javascript:. If you want to go the extra mile, you can check that only allowed combinations of elements are nested inside each other (this is easier the fewer elements you allow).

Finally, serialise the snippet's node again using saveHTML. Because you're creating new markup from a DOM rather than passing through the original, potentially malformed, markup, you block a whole class of odd-markup injection techniques.
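
The steps above could be sketched roughly as follows. This is a minimal illustration, not a vetted sanitiser: the function names (sanitize_html, clean_children, has_allowed_scheme) and the allowlists are made up for the example, the input is assumed to be UTF-8, and URLs without an explicit scheme are rejected outright to keep the check simple. The nesting check is left out.

```php
<?php
// Hypothetical allowlists: element name => attributes permitted on it.
const ALLOWED_TAGS = [
    'p' => [], 'b' => [], 'i' => [], 'em' => [], 'strong' => [],
    'br' => [], 'ul' => [], 'ol' => [], 'li' => [],
    'a' => ['href'],
];
const URL_ATTRIBUTES  = ['href', 'src'];
const ALLOWED_SCHEMES = ['http', 'https', 'mailto'];

function has_allowed_scheme(string $url): bool
{
    // Drop whitespace and control characters that could hide a scheme.
    $url = (string) preg_replace('/[\x00-\x20]+/', '', $url);
    if (!preg_match('/^([a-z][a-z0-9+.\-]*):/i', $url, $m)) {
        return false; // no explicit scheme: rejected in this sketch
    }
    return in_array(strtolower($m[1]), ALLOWED_SCHEMES, true);
}

function clean_children(DOMNode $parent): void
{
    // Copy the live node list first, because nodes are removed while looping.
    foreach (iterator_to_array($parent->childNodes, false) as $node) {
        if ($node instanceof DOMText) {
            continue; // text is escaped again when serialised
        }
        $tag = strtolower($node->nodeName);
        if (!($node instanceof DOMElement) || !isset(ALLOWED_TAGS[$tag])) {
            $parent->removeChild($node); // drops the element and its contents
            continue;
        }
        foreach (iterator_to_array($node->attributes, false) as $attr) {
            $name = strtolower($attr->nodeName);
            $allowed = in_array($name, ALLOWED_TAGS[$tag], true);
            if ($allowed && in_array($name, URL_ATTRIBUTES, true)) {
                $allowed = has_allowed_scheme($attr->value);
            }
            if (!$allowed) {
                $node->removeAttributeNode($attr);
            }
        }
        clean_children($node);
    }
}

function sanitize_html(string $input): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The meta tag assumes the input is UTF-8.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>'
        . $input . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    clean_children($body);

    // Serialise only the children of <body>, i.e. the original snippet.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling sanitize_html() on a comment body before storing or echoing it would be the typical use. Note that a disallowed element is removed together with everything inside it; if you would rather keep the text of, say, an unknown <span>, you would need to move its children up to the parent before removing it.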

bobince 2010-10-28 23:05:16
