I'm trying to run HTMLPurifier on user input from a WYSIWYG (CK Editor) and the images are breaking.

Unfiltered Input:

<img alt="laugh" src="/lib/ckeditor/plugins/smiley/images/teeth_smile.gif" title="laugh">

After running through purifier with default settings:

<img alt="&quot;laugh&quot;" src="%5C" title="&quot;laugh&quot;">

I have tried changing the configuration settings, but the src is never preserved. Any thoughts?

+1  A: 

I don't know what HTMLPurifier is, but the img tag you have there is perfectly legitimate (except that it is unclosed) before you run it. After you run it, things are being escaped twice and the result is garbage. %5C is the URL-encoded form of a backslash. It seems to be trying to escape the forward slash with a backslash and then choking. What is this program? Can I recommend HTML Tidy?

tandu
Unary tags do not require a `/>` to close them in HTML; that's an XHTML idiom.
Stan Rogers
Why would you bother to make HTML that cannot be XHTML? All XHTML can be HTML. The reverse is not true because of the above, for example.
tandu
Thanks for your help. I started with HTML Tidy but moved to HTMLPurifier for XSS protection. As far as HTML vs. XHTML goes, I'm limited to what CK Editor produces in this case. I may end up modifying it for a BBCode-like approach.
pws5068
If you generate XHTML but send it with the wrong MIME type, then it is wrong HTML. A conformant SGML parser would display an <img/> as <img>/ with a trailing slash. It just so happens that modern browsers have workarounds for HTML with trailing garbage (which is what the slash is under text/html).
mario
+1  A: 

I suspect magic_quotes could be the reason.

Also, did you try $config->set('Core.RemoveInvalidImg', true);? Which version are you using? (Try an older or newer one.)

mario
I'm using version 4.1.1. I tried the suggested config setting, but no luck. I'm not running any other sanitization routines on the string; it is being stored to the database through a MySQLi prepared statement.
pws5068
I mistook "magic quotes" for a call to mysql_escape_string. You were absolutely right: my php.ini had the magic_quotes_gpc directive set to On, which automatically escapes quotes. Problem solved. Thank you!
pws5068
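
For anyone landing here with the same symptom, here is a minimal sketch of why magic quotes breaks the markup before HTMLPurifier ever sees it. It assumes that addslashes() is a fair stand-in for what magic_quotes_gpc did to request data; the variable names are made up for illustration. With the quotes backslash-escaped, the attribute values no longer parse as quoted strings, which is consistent with the `src="%5C"` and `&quot;laugh&quot;` output in the question.

```php
<?php
// The markup CK Editor submitted (taken from the question).
$submitted = '<img alt="laugh" src="/lib/ckeditor/plugins/smiley/images/teeth_smile.gif" title="laugh">';

// Assumption: addslashes() simulates what magic_quotes_gpc applied
// to incoming GET/POST/cookie data. Every double quote gains a backslash.
$received = addslashes($submitted);

// Undo the escaping before the string is handed to the purifier.
$restored = stripslashes($received);

var_dump($received !== $submitted); // bool(true): the markup was altered
var_dump($restored === $submitted); // bool(true): stripslashes() recovers it
```

Turning the directive off in php.ini, as was done above, is the cleaner fix; stripslashes() is only a workaround for hosts where the setting cannot be changed. magic_quotes_gpc was removed in PHP 5.4, so this only affects older installations.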