Hey everyone,

I want to build a code snippet database web application. Would the best way to store the snippets be to HTML-encode everything before it goes into the database, to prevent XSS when the snippets are displayed on the web page?

Thanks for the help!

A: 

The best thing to do is to not store it in the database. I have seen people store stored procedures in databases as a row. Just because you can doesn't mean you should.

Woot4Moo
I don't think he means actually storing code for his site in the DB, but more storing code examples. Snippets of code people could use to figure out how to do something.
Slokun
That's an equally atrocious idea. If only there was some type of wiki-style way of storing information that's relevant to your business application.
Woot4Moo
Slokun, you are correct.
TheGNUGuy
http://www.wikispaces.com/
Woot4Moo
Nothing wrong with a BLOB/CLOB in a database. If you are using the right database and know how to tune it, it will scale to millions of snippets.
mrjoltcola
+2  A: 

You would either have to escape it when you store it, or escape it when you display it. It'd probably be better to do it on display so that if you need to edit it later on, you don't have to decode it then re-encode it.

Also, make sure the snippet is handled safely when you write it to the database; otherwise you'd be leaving yourself open to SQL injection. Parameterized statements are the best method: the snippet is passed as a bound value instead of being spliced into the SQL text, so you shouldn't have to change the raw data at all.
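A minimal sketch of the parameterized approach, written in Python with the standard `sqlite3` module purely for illustration (the same idea applies to prepared statements in other languages). The table layout and the function names `open_db`, `save_snippet`, and `load_snippet` are made up for this example.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create a hypothetical snippets table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snippets ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def save_snippet(conn, title, body):
    """Store the snippet exactly as submitted.

    The ? placeholders keep the values out of the SQL text, so quotes or
    SQL keywords inside the snippet are treated as data, not as SQL.
    """
    cur = conn.execute(
        "INSERT INTO snippets (title, body) VALUES (?, ?)", (title, body)
    )
    conn.commit()
    return cur.lastrowid


def load_snippet(conn, snippet_id):
    """Return the raw, unescaped snippet body, or None if it does not exist."""
    row = conn.execute(
        "SELECT body FROM snippets WHERE id = ?", (snippet_id,)
    ).fetchone()
    return row[0] if row else None
```

What comes back from `load_snippet` is byte-for-byte what the user submitted, so it can be edited later without any decode step. HTML escaping is then applied only when the snippet is rendered.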

Slokun
I agree that keeping it un-escaped in the DB is best. It also means that if you need to change the way it gets escaped (maybe you found a bug, or you want to add syntax-highlighting, say) then you won't have to go through and re-process everything already there. Add decent caching and you're good for performance, too.
Dean Harding
+1 for clear-text HTML in DB **AND** the reminder to be SQL-injection paranoid!
lexu
A: 

It doesn't matter how you store it; what matters is how you render it in the HTML representation. I'd guess you'll need to do some sort of sanitization before rendering the bytes. Another option might be to convert every character to an HTML entity; this might suffice to prevent any code or tags from actually being interpreted.
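To make the "every character to an HTML entity" option concrete, here is a rough sketch in Python (chosen only for illustration). The helper name `encode_all_entities` is hypothetical; it uses decimal numeric character references.

```python
def encode_all_entities(text):
    """Replace every character with its decimal numeric character reference."""
    # ord() gives the Unicode code point, so "<" becomes "&#60;".
    return "".join(f"&#{ord(ch)};" for ch in text)
```

The result contains no literal `<`, so the browser has no tag to interpret. The trade-off is that the output is several times larger than the input and unreadable in the page source, so escaping only the special characters is usually enough.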

As an example, view the source of a Stack Overflow page with some example code, and see how they're representing the code in the HTML.

Avi Flax
+1  A: 

The database has nothing to do with this; you simply need to escape the snippets when they are rendered as HTML.

At minimum, you need to encode all & as &amp; and all < characters as &lt;.
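As a small illustration of that minimum (in Python, only as a sketch; `escape_minimal` is a made-up name), note that the order of the two replacements matters:

```python
def escape_minimal(text):
    """Escape only & and <, the bare minimum for text placed between tags."""
    # Replace & first. If < were replaced first, the & inside the freshly
    # produced "&lt;" would itself be escaped, giving "&amp;lt;".
    return text.replace("&", "&amp;").replace("<", "&lt;")
```

This covers text inside an element only. It does not escape quotes, so it would not be safe for text placed inside an attribute value.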

However, your server-side language already has a built-in HTML encoding function; you should use it instead of re-inventing the wheel. For more details, please tell us what language your server-side code is in.

Based on your previous questions, I assume you're using PHP.
If so, you're looking for the htmlspecialchars or htmlentities functions.
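For comparison, this is roughly what "use the built-in function at render time" looks like in Python, where `html.escape` from the standard library plays a similar role. This is a sketch rather than PHP code, and `render_snippet` and the `<pre><code>` wrapper are assumptions made for the example.

```python
import html


def render_snippet(raw_code):
    """Wrap a raw snippet in <pre><code>, escaping it only at display time."""
    # quote=True also escapes " and ', which matters if the same text is
    # ever placed inside an attribute value.
    escaped = html.escape(raw_code, quote=True)
    return "<pre><code>" + escaped + "</code></pre>"
```

The stored snippet stays untouched; only the string sent to the browser is escaped, so changing the rendering later needs no changes to the data.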

SLaks
Yes I am using PHP.
TheGNUGuy
You should vote for answers that you find helpful by clicking the up arrows next to the answers.
SLaks
@SLaks At first, I laughed at this, but then I saw that he has 24 questions, and 0 votes. At least he has a 100% accept rate.
Kevin Crowell