ansaurus

Question

Is it a bad idea to escape HTML before inserting into a database instead of upon output?

Answer 1

+9 A:

Yes, because at some stage you'll want access to the original input entered. This is because...

- You never know how you want to display it - in JSON, in HTML, as an SMS?
- You may need to show it back to the user as is.

I do see your point about never wanting HTML entered. What are you using to strip HTML tags? If it is a regex, then look out for confused users who might type something like this...

3 < 4 yes, :->

They'll only get the 3 and space if it is a regex.
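To make that concrete, here is a small sketch in Python (the thread is about PHP, but a regex like this behaves the same way there). The pattern `<[^>]*>` and the helper name `strip_tags_naive` are assumptions: a typical naive tag stripper, not necessarily what anyone here is actually using.

```python
import re

# Hypothetical naive tag stripper: delete anything that looks like <...>.
NAIVE_TAG_RE = re.compile(r"<[^>]*>")

def strip_tags_naive(text: str) -> str:
    """Remove every '<' ... '>' run, whether or not it is a real tag."""
    return NAIVE_TAG_RE.sub("", text)

user_input = "3 < 4 yes, :->"
stripped = strip_tags_naive(user_input)
print(repr(stripped))  # '3 '
```

Everything from the `<` up to the `>` of the smiley is treated as one tag, so the user's comparison silently disappears.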

alex 2010-09-06 00:41:05

I'm using `htmlentities()` in PHP, which (for example) turns `<` into `&lt;`

a2h 2010-09-06 00:42:00

+1! I agree. Add to that the case where you change how you're doing your escaping, or you decide later that you want to allow certain tags like `<b>`, `<i>`, `<u>` and `<a>`. Escaping the data on the way out is future-proof.

mattmc3 2010-09-06 00:44:57

@a2h I tagged your question accordingly. I would use `htmlentities()` to display in your HTML, if that is how you wanted it displayed.

alex 2010-09-06 00:46:03

knittl 2010-09-06 00:53:19

@knittl Yeah, I'd use `htmlspecialchars()` too. I added the *if that is how you wanted it displayed* because he may want to have everything encoded to its entity.

alex 2010-09-06 00:55:31

Using either or even both of these functions is not a magic bullet that protects you from XSS under all circumstances. See the excellent reference in knittl's answer, which demonstrates the 6 different injection environments and the rules you need in each of them.
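A minimal sketch of one such case, written in Python, where `html.escape()` plays roughly the role of PHP's `htmlspecialchars()`. The payload and the `href` template are made up for illustration.

```python
import html

# Hypothetical attacker-supplied "homepage" value.
payload = "javascript:alert(1)"

# HTML-escaping has nothing to change: the payload contains no &, <, >, " or '.
escaped = html.escape(payload, quote=True)

# Inside an element body, the value is just displayed as text.
body = "<p>" + escaped + "</p>"

# Inside a URL attribute, the same escaped value is still a javascript: URL.
link = '<a href="' + escaped + '">homepage</a>'

print(escaped == payload)  # True
print(link)
```

The escaping function did its job for the HTML body context, but a URL attribute needs its own rule (for example, checking the scheme against an allow-list), which is the kind of per-context rule the reference lays out.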

Cheekysoft 2010-09-06 12:23:28

Answer 2

+5 A:

you will also restrict yourself when performing the escaping before inserting into your db. let's say you decide to not use HTML as output, but JSON, plaintext, etc.

if you have stored escaped html in your db, you would first have to 'unescape' the value stored in the db, just to re-escape it again into a different format.
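A rough sketch of that round trip, in Python rather than PHP (`html.escape()` stands in for `htmlspecialchars()`; the sample string and the `comment` key are invented for the example):

```python
import html
import json

raw = 'Tom & Jerry say "3 < 4"'  # what the user actually typed

# Option A: store the raw text, escape per output format when rendering.
as_html = html.escape(raw)              # for an HTML page
as_json = json.dumps({"comment": raw})  # for a JSON API

# Option B: the text was HTML-escaped before it went into the db.
stored_escaped = html.escape(raw)

# Encoding the stored value directly leaks HTML entities into the JSON.
wrong_json = json.dumps({"comment": stored_escaped})

# To get correct JSON you have to unescape first, then re-encode.
fixed_json = json.dumps({"comment": html.unescape(stored_escaped)})

print(as_html)
print(wrong_json)
print(fixed_json == as_json)  # True
```

With the raw value stored, each output format needs exactly one encoding step; with the escaped value stored, every non-HTML format needs an extra decoding step first.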

also see this perfect OWASP article on XSS prevention

knittl 2010-09-06 00:42:26

Answer 3

A:

I usually store both versions of the text. The escaped/formatted text is used when a normal page request is made to avoid the overhead of escaping/formatting every time. The original/raw text is used when a user needs to edit an existing entry, and the escaping/formatting only occurs when the text is created or changed. This strategy works great unless you have tight storage space constraints, since you will be duplicating data.
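One way this could look, sketched in Python with an in-memory SQLite database. The table layout, column names, and helper functions are all assumptions for illustration, and `html.escape()` stands in for whatever escaping/formatting the page really needs.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, raw TEXT NOT NULL, rendered TEXT NOT NULL)"
)

def render(raw: str) -> str:
    """Stand-in for the real escaping/formatting step."""
    return html.escape(raw)

def save_comment(raw: str) -> int:
    """Store the raw text and its rendered form together; return the new id."""
    cur = conn.execute(
        "INSERT INTO comments (raw, rendered) VALUES (?, ?)", (raw, render(raw))
    )
    return cur.lastrowid

def update_comment(comment_id: int, raw: str) -> None:
    """Re-render only when the text changes."""
    conn.execute(
        "UPDATE comments SET raw = ?, rendered = ? WHERE id = ?",
        (raw, render(raw), comment_id),
    )

def for_display(comment_id: int) -> str:
    """Normal page request: read the pre-rendered text, no escaping needed."""
    row = conn.execute(
        "SELECT rendered FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()
    return row[0]

def for_editing(comment_id: int) -> str:
    """Edit form: hand the user back exactly what they typed."""
    row = conn.execute(
        "SELECT raw FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()
    return row[0]
```

Because the raw column is always kept, the rendered column can be regenerated from it if the escaping or formatting rules ever change.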

limscoder 2010-09-06 02:01:15
