I am working on a kind of document management system. The end users are business users.

I currently allow a file to be uploaded only if its extension matches one of the following:

"png|jpe?g|gif|xls|doc|docx|csv|ppt|txt|pdf|rtf"

My questions are:

  • If I add "xml" to the list, can it cause any security issues?

  • What other document types/extensions can I add to this list?

Or

Should I check NOT IN "exe|bat|php|js" and allow all other types?

Thanks for suggestions.

+1  A: 

Don't forget the rest of the new MS Office formats, xlsx, pptx and similar files.

As for the security risk, it depends on where and how these files will be accessed. From a document management standpoint, XML isn't something I would allow by default, but I could see it being needed.

I wouldn't use a catch-all exclude list unless you also control how the content is served.

Mitchel Sellers
+1  A: 

Add the new MS Office document extensions and the OpenOffice document formats and you are probably good to go. Adding Zip/Rar archives is an option too, but in that case you should consider restricting direct access to the files.

Perhaps adding a clear message and a link saying that new document types can be added upon request might help too?

ChrisR
+3  A: 

Should I check NOT IN "exe|bat|php|js" and allow all other types?

No. Whitelisting is better than blacklisting. There are many, many more dangerous filetypes than you are likely to know about. Those four barely scratch the surface.
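As a rough illustration of the whitelist approach, here is a minimal Python sketch. The pattern is the one from the question, extended with the newer Office extensions mentioned in the other answers; the function name `is_allowed_upload` is made up for this example, and the check only looks at the final extension, so it is a first filter rather than a complete defence.

```python
import re

# Whitelist from the question, plus xlsx/pptx as suggested in the other answers.
# Adjust the list to whatever your users actually need.
ALLOWED_EXTENSIONS = re.compile(
    r"png|jpe?g|gif|xls|xlsx|doc|docx|ppt|pptx|csv|txt|pdf|rtf",
    re.IGNORECASE,
)


def is_allowed_upload(filename: str) -> bool:
    """Return True only if the file's final extension is on the whitelist."""
    # Only the last dot-separated part counts, so "report.pdf.exe" is rejected.
    name, dot, extension = filename.rpartition(".")
    if not dot or not name:
        # No extension at all, or a dotfile such as ".htaccess".
        return False
    # fullmatch avoids accepting "pdfx" or an extension with trailing junk.
    return ALLOWED_EXTENSIONS.fullmatch(extension) is not None
```

Anything that is not explicitly listed is refused, which is the point: you do not need to know every dangerous type in advance.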

If I add "xml" to the list, can it cause any security issues?

Yes, it can. [X][HT]ML may contain scripting that runs in the security context of the site that served it. That allows anyone who can upload documents to your site to inject JavaScript into it (stealing cookies, request forgery, etc.).

However, it doesn't actually add any security issue you don't already have, because even whitelisting by filetype/extension is not secure, thanks to IE and its misbegotten type-sniffing. You can upload a .txt file and serve it correctly with the Content-Type: text/plain header, but if it contains sequences that IE thinks look like HTML, IE will ignore the header and render it as HTML. Boom, XSS.

(The same is true of any other type really, but .txt is the most openly vulnerable.)

There are two approaches to fixing this mess:

  1. serve all user-uploaded files from a different hostname to the main application site, so that they are in different JS security contexts and do not share cookies or authentication data.

  2. serve all user-uploaded files with the Content-Disposition: attachment header so that they will always be downloaded and not displayed within the browser.

(2) on its own should be watertight, but in practice in the past there have been ways around it due to browser and plugin exploits, so I'm not sure I'd completely trust in it. (1) on its own stops XSS, but it doesn't stop other nasties like HTML files containing iframes to exploit sites.

So it's best to do both.
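A minimal, framework-independent Python sketch of doing both might look like the following. The hostname `usercontent.example`, the `/download/` path and both function names are assumptions made up for illustration; plug the returned values into whatever your web framework uses to build links and set response headers.

```python
import re

# Hypothetical hostname used only for user uploads. It must differ from the
# main application's hostname so the two do not share cookies or a JS context.
UPLOAD_HOST = "usercontent.example"


def download_url(document_id: int) -> str:
    """Build a link that points at the separate upload hostname (approach 1)."""
    return f"https://{UPLOAD_HOST}/download/{document_id}"


def download_headers(filename: str, content_type: str) -> dict:
    """Build response headers that force a download (approach 2)."""
    # Replace anything outside a conservative character set, so quotes or
    # line breaks in the stored name cannot break out of the header value.
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "_", filename)
    return {
        "Content-Type": content_type,
        "Content-Disposition": f'attachment; filename="{safe_name}"',
    }
```

For example, `download_headers("notes.txt", "text/plain")` yields a `Content-Disposition` value of `attachment; filename="notes.txt"`, and the link handed to the user comes from `download_url(...)` rather than from the main site's hostname.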

bobince
Ah .. good points. Thanks.
Wbdvlpr