User input added to a web page should be encoded (unless it's meant to be HTML, of course) to help prevent XSS attacks and the like:

litForename.Text = HttpUtility.HtmlEncode(MyUser.Forename);

I'm putting together a template to generate my business logic layer, and I'm thinking of using it to do all the encoding as soon as the data comes out of the database, before it gets to the UI code. This would ensure that everything that should be encoded is encoded (I'd obviously exclude the columns that contain XHTML/XML strings). An overload on the data access methods would allow retrieval of data with no encoding (so it can be edited):

// Get a 'User' entity with all the string fields HTML encoded
BLL.Users.GetById(int userId)

// Get a 'User' entity with optional HTML encoding
BLL.Users.GetById(int userId, bool useHtmlEncoding)

Is this an approach that anyone else uses, or is it a dumb idea? What are the pros and cons?

Thanks.

+3  A: 

There may be edge cases where this makes sense but in general I would advise against this. Your business logic layer should deal with business logic and business logic only.

Likewise your controllers (assuming ASP.NET MVC) should be dealing with values that make sense in the context of your business domain, rather than values already altered in anticipation of a particular type of UI.

Your UI layer is the only layer which should know and care about what type of UI it is. It may seem, at the moment, that your only type of UI is going to be HTML based but that may change.
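
A minimal sketch of that separation, under some assumptions: the `User` entity and the `UserView` helper are hypothetical names, not taken from the question, and `System.Net.WebUtility.HtmlEncode` stands in for `HttpUtility.HtmlEncode` so the snippet does not need a reference to System.Web.

```csharp
using System.Net;

// Hypothetical entity: the business layer hands back raw, unencoded values.
public class User
{
    public string Forename { get; set; } = "";
}

// Hypothetical UI-layer helper: the only code that knows the output is HTML.
public static class UserView
{
    public static string RenderForename(User user)
    {
        // Encode at the last moment, just before the value goes into markup.
        return "<span>" + WebUtility.HtmlEncode(user.Forename) + "</span>";
    }
}
```

The entity keeps its raw value, so a different kind of UI (or a non-UI caller) can use the same business layer without having to strip HTML entities first.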

AdamRalph
I concur - thanks for the sanity check.
Nick
+1  A: 

The problem with using HtmlEncode on data that gets saved to the database is that you then have to deal with things like &amp; and &quot; in your data. For example, "Tom O'Brien" is going to be saved to the database as "Tom O&#39;Brien" (assuming a version of HtmlEncode that encodes apostrophes, as current .NET does). Doing a SELECT or UPDATE on that is gonna be tricky.

I think you will do better by only using HtmlEncode for displaying text in the UI.

DOK
I was thinking of encoding the data on the way out only - everything going into the database will be stored as is.
Nick
A: 

I agree with the other posters that view-level data conversion belongs in the view generation. You may start out with only XML-based views (e.g., XHTML, VoiceXML for voice browsing, XML for web services), but what happens when you decide you also need JSON views to support AJAX interactions? JSON and JavaScript string literals use a different escape mechanism than XML.
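
To illustrate the difference, here is a rough sketch that escapes the same raw string for an HTML view and for a JSON view. The `EscapingDemo` class is a hypothetical name, and the snippet assumes `System.Text.Json` is available (.NET Core 3.0 or later); `WebUtility.HtmlEncode` is used in place of `HttpUtility.HtmlEncode`.

```csharp
using System.Net;
using System.Text.Json;

public static class EscapingDemo
{
    // HTML view: entity-encode the raw value.
    public static string ForHtml(string raw) => WebUtility.HtmlEncode(raw);

    // JSON view: serialize the raw value as a JSON string literal.
    public static string ForJson(string raw) => JsonSerializer.Serialize(raw);
}
```

For the input `He said "hi"`, `ForHtml` yields `He said &quot;hi&quot;`, while `ForJson` yields a quoted literal that uses backslash escapes rather than entities. If the business layer had already HTML-encoded the value, the JSON consumer would receive the entities as literal text.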

You'll also have cases where one logic tier method needs to call another for a purpose unrelated to view generation. Perhaps the calling method needs to apply some bulk data transformation that populates another database table. The calling method would have to undo the XML escapes in that case.
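
A small sketch of what that undo step might look like, assuming a hypothetical logic-tier method that returns pre-encoded data (the `BulkCopyDemo` class and its method names are made up for illustration, and `WebUtility` stands in for `HttpUtility`):

```csharp
using System.Net;

public static class BulkCopyDemo
{
    // Stand-in for a logic-tier method that returns pre-encoded data.
    public static string GetForenameEncoded() => WebUtility.HtmlEncode("Tom O'Brien");

    // A caller that is not building HTML has to reverse the encoding
    // before it can store or compare the value.
    public static string GetForenameForStorage() =>
        WebUtility.HtmlDecode(GetForenameEncoded());
}
```

Every non-view caller would need to remember that extra `HtmlDecode` call, and forgetting it would write entities such as `&#39;` into the second table.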

Jim Ferrans
+1  A: 

Your business logic really shouldn't know about your presentation. Whether you're feeding a web, Windows or any other type of UI, you shouldn't have those details in your business logic.

Have you considered that people using your business layer might try to encode the data again on top of your encoding? That could cause things to look really messy.
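
A quick sketch of that double-encoding problem, using `System.Net.WebUtility.HtmlEncode` as a stand-in for `HttpUtility.HtmlEncode` (the `DoubleEncodingDemo` class is hypothetical):

```csharp
using System.Net;

public static class DoubleEncodingDemo
{
    // What the business layer would return if it encoded on the way out.
    public static string EncodeOnce(string raw) => WebUtility.HtmlEncode(raw);

    // What ends up on the page if the UI code then encodes again.
    public static string EncodeTwice(string raw) =>
        WebUtility.HtmlEncode(WebUtility.HtmlEncode(raw));
}
```

For the input `Tom & Jerry`, `EncodeOnce` gives `Tom &amp; Jerry`, which a browser renders correctly. `EncodeTwice` gives `Tom &amp;amp; Jerry`, which a browser renders as the visible text `Tom &amp; Jerry`.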

Kelsey
A: 

Learn the lessons of PHP's magic_quotes_gpc feature: such encoding will doubtless only confuse things more, cause you to unescape when you shouldn't, forget to escape when you should, and generally be a pain. Don't encode your data until it's about to be sent where it needs to go, be it the database, the Web, or somewhere else.

Paul Fisher