I'm trying to protect myself from sql injection and am using:

mysql_real_escape_string($string);

When posting HTML it looks something like this:

<span class="\&quot;className\&quot;">
<p class="\&quot;pClass\&quot;" id="\&quot;pId\&quot;"></p>
</span>

I'm not sure how many other characters mysql_real_escape_string escapes, so I don't want to just replace a few and miss others. How do I "decode" this back into correctly formatted HTML? With something like:

html_entity_decode(stripslashes($string));
+3  A: 

The mysql_real_escape_string() manual page tells you which characters are escaped:

mysql_real_escape_string() calls MySQL's library function mysql_real_escape_string, which prepends backslashes to the following characters: \x00, \n, \r, \, ', " and \x1a.

You could successfully reverse the escaping by replacing those escaped characters with their unescaped forms.

mysql_real_escape_string() shouldn't be used to sanitize HTML though... there's no reason to use it before outputting web page data. It should only be used on data that you're about to put into the database. Your sanitization process should look something like this:

Input

  1. Accept user input from a form or HTTP request
  2. Create database query using mysql_real_escape_string()

Output

  1. Fetch data out of the database
  2. Run any user-defined data through htmlspecialchars() before printing

Using a different database driver such as MySQLi or PDO will allow you to use prepared statements, which take care of escaping most inputs for you. However, if you can't switch or take advantage of those, then definitely use mysql_real_escape_string()... just only use it before inserting data.
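
The input and output steps above can be sketched with PDO prepared statements. This is a minimal illustration, not a definitive implementation: an in-memory SQLite database stands in for MySQL so the snippet runs without a server, and the `comments` table, its `body` column, and the sample input are all made up for the example.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL (assumption).
// The table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

// Input: pretend this arrived from a form field.
$userInput = "O'Reilly said \"hi\" <script>alert(1)</script>";

// The value is bound as a parameter, so no manual escaping is needed.
$insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$insert->execute([':body' => $userInput]);

// Output: fetch the value; it comes back exactly as it went in.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();

// Escape for HTML only at the moment of printing.
$safeForHtml = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
echo $safeForHtml, PHP_EOL;
```

The database layer and the HTML layer each get their own escaping, applied at the point where the data enters that layer and nowhere else.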

zombat
I would recommend prepared statements (e.g. http://www.php.net/manual/en/class.pdostatement.php) over `mysql_real_escape_string`. And `htmlspecialchars` is not always the right choice. Sometimes white-listing is a better option.
Matthew Flaschen
+2  A: 

You've mixed everything up.
mysql_real_escape_string doesn't need any decoding.
It must be applied to data that goes into the query. The operation is bound to the query, not to user input. Nothing else should happen between escaping and building the query. So it must look like this:

$html=mysql_real_escape_string($html);

nothing here

$query="INSERT INTO table SET html='$html'";

But you've probably added some other functions, like htmlspecialchars, which is how you ended up with such a mess.

So, to solve your problem:

  1. Decode nothing.
  2. Encode only what needs to be encoded, and only when it needs to be encoded.

If you are going to output the HTML as is, do not replace any quotes with entities.
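
The "decode nothing" point can be checked with a small round trip. This is a sketch under stated assumptions: SQLite via PDO stands in for MySQL, and PDO::quote() stands in for mysql_real_escape_string() so that no server connection is needed; the `pages` table is hypothetical.

```php
<?php
// Sketch only: PDO::quote() on SQLite is a stand-in for
// mysql_real_escape_string() on MySQL. The table name is hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE pages (html TEXT)');

$html = '<span class="className">It\'s <b>bold</b></span>';

// Escape immediately before building the query, and nowhere else.
// Unlike mysql_real_escape_string(), PDO::quote() also adds the
// surrounding quotes.
$quoted = $pdo->quote($html);
$pdo->exec("INSERT INTO pages (html) VALUES ($quoted)");

// The escaping lives only in the SQL text. What the database stores,
// and what comes back, should be the original string.
$stored = $pdo->query('SELECT html FROM pages')->fetchColumn();
var_dump($stored === $html);
```

If the comparison holds, there is nothing to strip on the way out; extra characters in the output point to a second encoding step somewhere before the query.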

Col. Shrapnel
$query="INSERT INTO table SET html='$html'"; is not standard SQL, you'd better use INSERT INTO table (html) VALUES('content'); This works in all databases, not just MySQL.
Frank Heikens
@Frank Heikens But I am working with MySQL, and I use tons of MySQL-specific features. You might as well tell me not to use PHP because it is not supported everywhere. What a nonsense comment!
Col. Shrapnel
+3  A: 

mysql_real_escape_string is used to prevent SQL injection when storing user provided data into the database, but a better method would be to use data binding using PDO (for example). I always recommend using that instead of messing with escaping.

That being said, regarding your question about how to display it afterwards: once the data is stored, what you retrieve is complete and valid, with no need to be "unescaped". That is, unless you added your own escape sequences, so please don't do that.

Guss
A: 

I think a number of other answers missed the obvious issue...

You are using mysql_real_escape_string on the inputted content (as you should if not using prepared statements).

Your issue is with the output.

The current issue is that you are calling html_entity_decode. stripslashes alone is all you need to restore the original text. html_entity_decode is what is messing up your quotes, etc., as it is changing them. You actually want to output the HTML, not just plain text (which is when you would use htmlentities, etc.). You are decoding something you want encoded.

If you only want the text version to show up, you can use the entities. If you are worried about bad tags, use strip_tags and allow only the tags you want (such as b, i, etc.).

Finally, remember to encode and decode in the proper order. If you ran mysql_real_escape_string(htmlentities($str)), then you need to run html_entity_decode(stripslashes($str)). The order of operations matters.

UPDATE: I did not realize that html_entity_decode also strips out slashes. It was not clearly documented on that page, and I just never caught it. I will still automatically run it though, as most html that I present I want left as entities, and even when I don't, I prefer to make that decision outside of my db class, on a case by case basis. That way, I know the slashes are gone.

It appears the original poster is running htmlentities (or his input program, such as TinyMCE, is doing it for him), and he wants to turn it back into content. So html_entity_decode($str) should be all that is required.
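
A minimal sketch of the two functions mentioned above, assuming the stored text was produced by htmlentities with the same flags used here. The sample strings are made up for illustration.

```php
<?php
$html = '<p class="pClass" id="pId">Tom & Jerry</p>';

// Encoding turns markup characters into entities...
$encoded = htmlentities($html, ENT_QUOTES, 'UTF-8');

// ...and html_entity_decode() with the same flags reverses it.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
var_dump($decoded === $html);

// strip_tags() with an allow-list removes every tag not named in it.
// The text between removed tags is kept.
$untrusted = '<b>bold</b> <i>italic</i> <script>alert(1)</script>';
$filtered = strip_tags($untrusted, '<b><i>');
echo $filtered, PHP_EOL;
```

Note that strip_tags does not inspect attributes on the tags it allows, so an allow-list alone may not be enough for untrusted HTML.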

Cryophallion
You are wrong. He does not need to strip slashes. He needs to add them properly. Cure the disease, not the symptom.
Col. Shrapnel
He DOES need to strip the slashes out, as he ran escape string on it first. He encoded it, now he needs to decode it to get rid of the slashes in the output. Hence the \ appearing before the ".
Cryophallion
You have no clue how the thing works. So better ban yourself from answering until you learn some. No stripping needed. Try it yourself.
Col. Shrapnel
Finally, remember to encode and decode in the proper order. If you ran mysql_real_escape_string(htmlentities($str)), **no decoding action is required**. Go figure. If you don't want entities, just do not encode them. If you did encode them, why decode?
Col. Shrapnel
My database class runs real_escape_string before every insert. In order to get rid of the slashes in front of the quotes, I have to run stripslashes, or the slashes are escaped. I absolutely know how it works - I write this stuff all the time. He's trying to get his HTML back, and there are slashes. How, other than stripslashes, do you plan on doing this without overcomplicating it? He needs to make sure he has the entities back first, though! The issue is that he's running strip first; he should be running decode first.
Cryophallion
`I have to run stripslashes, or the slashes are escaped. I absolutely know how it works ` ahahahahaha! what a funny comment :) Go turn magic quotes off, and read some manual entry on it. Your question was asked recently http://stackoverflow.com/questions/2573150/is-the-backslash-counted-as-a-character-in-mysql
Col. Shrapnel
That question has no reference at all to this (it is not about whether slashes take up characters in MySQL). A. I don't use magic quotes. B. I stand corrected on needing to use stripslashes with html_entity_decode. It is not clearly documented on the manual page, and it is nice to know. HOWEVER, it is not necessarily a bad habit to get into (running stripslashes before anything else), in case you want the option of converting the entities or not.
Cryophallion
A: 

Not sure what is going on with the formatting as I can see it but your html form

<span class="\&quot;className\&quot;">
<p class="\&quot;pClass\&quot;" id="\&quot;pId\&quot;"></p>
</span>

should be simply:

<span class="className">
<p class="pClass" id="pId"></p>
</span>

When you get it back, before you put it into the database, you escape it using mysql_real_escape_string() to make sure you do not suffer an SQL injection attack.

Hence you are escaping the values ready for the place the text is going next.

When you get it out of the database (or display ANY of it to users as HTML), you escape it again, ready for the place it is going next (HTML), with htmlentities() etc. to protect your users from XSS attacks.

This forms the EO part of the mantra FIEO, Filter Input, Escape Output, which you should tattoo on the inside of your eyelids.
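
A rough sketch of the Escape Output half, assuming PHP 7 or later for the type declarations. The helper name `e()` and the sample values are hypothetical, not part of any library.

```php
<?php
// Hypothetical helper: escape a value for the place it is going next,
// here an HTML text node or a quoted attribute value.
function e(string $value): string
{
    return htmlentities($value, ENT_QUOTES, 'UTF-8');
}

// Pretend these values came out of the database.
$name = '"><script>alert(1)</script>';
$bio  = 'Likes <b>PHP</b> & MySQL';

// Escaped at the point of output, the markup is shown as text
// instead of being interpreted by the browser.
echo '<input type="text" value="' . e($name) . '">', PHP_EOL;
echo '<p>' . e($bio) . '</p>', PHP_EOL;
```

Keeping the call at the point of output means the stored data stays untouched and can still be escaped differently for other destinations.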

Cups
Are you sure he wants this form to be escaped? I am in deep doubt. If someone uses HTML formatting, they usually want it working, not shown as visible tags.
Col. Shrapnel