views:

314

answers:

5

I am looking at starting a hosted CMS-like service for customers.

As such, it would require the customer to input text, which would be served up to anyone who visits their site. I am planning on using Markdown, possibly in combination with WMD (the live Markdown preview that SO uses), for the big blocks of text.

Now, should I be sanitizing their input for HTML? Given that there would only be a handful of people editing their 'CMS', all paying customers, should I be stripping out the bad HTML, or should I just let them run wild? After all, it is their 'site'.

Edit: The main reason as to why I would do it is to let them use their own javascript, and have their own css and divs and what not for the output

+10  A: 

Why wouldn't you sanitize the input?

If you don't, you're inviting calamity - to either your customer or yourself or both.

warren
If he's allowing 3rd party Javascript, can anything truly be safe?
Joe Skora
perhaps not, but he can do his part to help :)
warren
+1  A: 

At least parse their entry and only allow a certain "safe" subset of HTML tags.
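
As a rough illustration of the "safe subset" idea, here is a minimal sketch using only Python's standard `html.parser`. The tag list, the attribute list, and the names `AllowlistSanitizer` and `sanitize` are assumptions made up for this example, not anything the asker's stack provides. It drops every tag and attribute not explicitly listed, discards the contents of `script` and `style`, and re-escapes all text. For a real service a maintained sanitizer library is probably the safer choice.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: adjust to what your editors actually need.
ALLOWED_TAGS = {"p", "br", "em", "strong", "ul", "ol", "li",
                "blockquote", "code", "pre", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
VOID_TAGS = {"br"}                 # emitted without a closing tag
DROP_CONTENT = {"script", "style"}  # drop the tag and everything inside it


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            # Only absolute links with a known scheme survive.
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


print(sanitize('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>'))
# prints: <p>Hi there</p>
```

Note that this sketch does not balance tags or handle relative links, so treat it as a starting point only.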

Ken Ray
A: 

I think you should always sanitize the input. Most people use a CMS because they don't want to create their own website from scratch and they want easy access to edit their pages. These users most likely will not be trying to put in text that would get sanitized, but by protecting against it you are protecting their users.

Joshua Hudson
+1  A: 

You would also be protecting against disgruntled employees, cross-customer attacks, or any other sort of idiotic behavior.

You should always sanitize, no matter the users or viewers.

Carlton Jenke
+1  A: 

Your question asks:

"Edit: The main reason as to why I would do it is to let them use their own javascript, and have their own css and divs and what not for the output".

If you allow users to supply arbitrary JavaScript, then sanitizing input is not worth the effort. The definition of Cross-Site Scripting (XSS) is basically "users can supply JavaScript and some users are bad".

Now, some websites do allow users to supply JavaScript and they mitigate the risk in one of two ways:

  1. Host each user's CMS under a different domain. Blogger and Tumblr (e.g. myblog.blogspot.com vs. blogger.com) do this to prevent users' templates from stealing other users' cookies. You have to know what you are doing and never host any of the user content under the root domain.
  2. If user content is never shared between users, then it does not matter what script malicious users supply. However, CMSs are about sharing, so this probably doesn't apply here.
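
To make option 1 a bit more concrete, here is a small sketch of the two pieces involved: giving each customer their own hostname on a domain separate from the admin application, and issuing the admin session cookie without a `Domain` attribute so that it stays host-only. The domain `cms-sites.example`, the cookie name `session`, and both function names are hypothetical, chosen only for this example.

```python
import re
from http.cookies import SimpleCookie

# Hypothetical: customer content lives on its own domain,
# never on the domain that serves the admin application.
CONTENT_SUFFIX = ".cms-sites.example"
SLUG = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?")


def content_host_for(customer_slug):
    """Return the hostname that serves one customer's pages."""
    if not SLUG.fullmatch(customer_slug):
        raise ValueError(f"invalid customer slug: {customer_slug!r}")
    return customer_slug + CONTENT_SUFFIX


def session_cookie_header(session_id):
    """Build a Set-Cookie value for the admin app's session."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    morsel = cookie["session"]
    morsel["path"] = "/"
    morsel["secure"] = True
    morsel["httponly"] = True   # not readable from page scripts
    morsel["samesite"] = "Lax"
    # Deliberately no Domain attribute: the cookie is then only sent
    # back to the exact host that set it, not to sibling subdomains.
    return morsel.OutputString()


print(content_host_for("acme"))
print(session_cookie_header("abc123"))
```

This is only part of the picture. Sibling subdomains can still interfere with each other in other ways, which is one reason the "know what you are doing" caveat above matters.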

There are some blacklist filters out there that may work, but they only work today. The HTML spec and browsers change regularly, which makes filters almost impossible to maintain. Blacklisting is a surefire way to have both security and functional problems.
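
A quick way to see why blacklists are fragile is to write a naive one and feed it a couple of well-known bypasses. The filter below is a deliberately simplistic stand-in (the name `naive_blacklist` and the regex are made up for this sketch, not taken from any real library). It removes `<script>` blocks and nothing else.

```python
import re

# Deliberately naive: strips <script>...</script> blocks in a single pass.
SCRIPT_BLOCK = re.compile(r"<script\b.*?>.*?</script\s*>",
                          re.IGNORECASE | re.DOTALL)


def naive_blacklist(html_text):
    return SCRIPT_BLOCK.sub("", html_text)


samples = [
    # The case the filter was written for: removed entirely.
    "<script>alert(1)</script>",
    # No script tag at all, but the event handler still runs script.
    "<img src=x onerror=alert(1)>",
    # Removing the inner block reassembles a working outer script tag.
    "<scr<script></script>ipt>alert(1)</script>",
]

for sample in samples:
    print(repr(sample), "->", repr(naive_blacklist(sample)))
```

The second sample passes through untouched, and the third comes out as `<script>alert(1)</script>`. Each hole can be patched, but the list of things to patch keeps growing.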

When dealing with user data, always treat it as untrusted. If you don't address this early in the product and your scenarios change, it is almost impossible to go back and find all of the XSS points or modify the product to prevent XSS without upsetting your users.

Chris Clark