I am currently HTML-encoding all user-entered text before inserting or updating a DB table record. The problem is that on every subsequent update, the previously encoded string is encoded again, so the stored value keeps growing and is starting to eat up a lot of column space in my tables. I am using parameterized queries for all SQL statements, but I am wondering: would it be safe to just let the .NET Framework handle this part, without the HTML encoding?

+1  A: 

I wouldn't recommend encoding the data in the database.

The encoding has nothing to do with the data itself; it is specifically targeted at how you are displaying the data. What if you want a client app, or some other non-HTML display, to use this data in the future?

You should store the raw data in your tables, and the applications, or the layer that services the applications, should handle the encoding to whatever formats are required.

The .NET Framework can easily do it for you. Just remember to use HtmlEncode or, in ASP.NET 4, the <%: syntax. You should be doing this for ANY dynamic data that you present.

Storing it encoded in the database will not only cause you problems today but will keep causing them in the future.
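To make the difference concrete, here is a minimal sketch of why encoding before every save keeps growing the value, and what encoding only at display time looks like. It assumes .NET 4 or later, where `System.Net.WebUtility` is available; the class name `EncodeOnOutputDemo` and the helper `ForDisplay` are made up for illustration and are not part of any framework.

```csharp
using System;
using System.Net;

static class EncodeOnOutputDemo
{
    // Hypothetical helper: encode only at the moment the value is written into HTML.
    public static string ForDisplay(string raw)
    {
        return WebUtility.HtmlEncode(raw);
    }

    static void Main()
    {
        string raw = "Tom & Jerry";

        // Encoding before each save: every update encodes the previous result again.
        string afterFirstSave = WebUtility.HtmlEncode(raw);              // Tom &amp; Jerry
        string afterSecondSave = WebUtility.HtmlEncode(afterFirstSave);  // Tom &amp;amp; Jerry
        Console.WriteLine(afterFirstSave);
        Console.WriteLine(afterSecondSave);

        // Storing raw and encoding on output: the stored value never changes,
        // and the page always receives exactly one level of encoding.
        string stored = raw;                     // what goes into the parameterized query
        Console.WriteLine(ForDisplay(stored));   // Tom &amp; Jerry
    }
}
```

With the second approach the value in the table stays the same length no matter how many times the record is edited and saved.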

Kelsey
First of all, thank you for the response. My concern is mainly multiline text fields. Based upon what you are saying I should only encode what I am showing to the end user and not in the db table. Would I then decode the value of a textarea before updating it in the db table?
Corin
@Corin if the data in the DB is not already encoded, then you don't need to do anything to it. Store it as is. When you display it in your HTML, that is when you do all the encoding to make it look proper and replace the line breaks. Are you dealing with data that is already encoded and now you have to deal with it, or can you ensure that no encoded data gets into your DB?
Kelsey
I apparently did the encoding at the wrong end, meaning I encoded before saving to the DB tables. The issue I am trying to resolve is script injection. I know I can let the .NET Framework catch those, but I was trying to add another layer to it. At this point, I can go into my code and fix it. I am just wondering: if I encode a string value and place it into a textarea, should I decode it before saving that textarea value back to the table?
Corin
@Corin so it sounds like you're stuck with some encoded data. Can you run a script on your DB and just update the data with the decoded values? You need to get to a known state. It would be worse to have some data encoded and some not, and then have to figure out which is which. Get to a known state and then go with the solution from there. If you have encoded data in your DB and you don't want to clean it, then stick with it. I would recommend cleaning it and then ensuring that all data in the DB is NOT encoded.
Kelsey
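A one-time cleanup along those lines could be sketched as below. This is only an illustration, not a tested migration script: `EncodingCleanup` and `DecodeUntilStable` are hypothetical names, `maxPasses` is an arbitrary safety limit, and it assumes that no user ever meant to store a literal entity such as `&amp;` as text (such a value would be decoded too). It relies on `System.Net.WebUtility.HtmlDecode`, available in .NET 4 and later.

```csharp
using System;
using System.Net;

static class EncodingCleanup
{
    // Decode repeatedly until the value stops changing, so rows that were
    // encoded once, twice, or many times all end up as raw text.
    public static string DecodeUntilStable(string value, int maxPasses = 10)
    {
        string current = value;
        for (int pass = 0; pass < maxPasses; pass++)
        {
            string decoded = WebUtility.HtmlDecode(current);
            if (decoded == current)
            {
                break; // nothing left to decode
            }
            current = decoded;
        }
        return current;
    }

    static void Main()
    {
        // A value that went through three encode-on-save cycles.
        Console.WriteLine(DecodeUntilStable("Tom &amp;amp;amp; Jerry")); // Tom & Jerry
        // A value that was never encoded is left alone.
        Console.WriteLine(DecodeUntilStable("plain text"));              // plain text
    }
}
```

The cleaned value would then be written back with the same parameterized UPDATE the application already uses, ideally after backing up the table.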
+1  A: 

You should always HTML encode user data upon displaying, never upon storing. Save the user input in DB (using parametrized queries or whatnot to prevent SQL injection) and then HTML encode when outputting the data. That way you'll never have this problem.

HTML encoding is built into the ASP.NET framework and is really simple to use. This is how you do it:

<!-- ASP.NET 3.5 and below (Html.Encode is the ASP.NET MVC helper; Web Forms pages can use Server.HtmlEncode) -->
<%= Html.Encode(yourStuff) %>

<!-- ASP.NET 4 -->
<%: yourStuff %>
Tomas Lycken
Thank you for the quick response. Would I then decode the already encoded string before updating the current record I am viewing?
Corin
@Corin, Actually, I think the best (albeit maybe not the easiest *right now*) option is to "sanitize" your database from encoded data, and try to restore the raw input also for old records. That way, you won't have to worry about this in the future.
Tomas Lycken