I am going to add a file upload control to my ASP.NET 2.0 web page so that users can upload files. The files will be stored on the server in a folder named after the user. I want to know the best way to name the files when saving them to the server. The approach needs to take security, performance, and flexibility in handling the files into account. If I have missed anything, please guide me.

Options I am considering now :

  1. Save the file under the same name as the uploaded file
  2. Use user ID + random number + the name of the uploaded file
  3. Use a random number + the current time in seconds as the file name, with a table that maps this number to the user's upload

Anything else? What is the best way?

Thanks in advance

+4  A: 

NEVER EVER use user input for filenames. Don't use the username. Use the user ID instead (I assume your users have a unique ID).

NEVER use the original filename. Use your solution number 3, plus the user id instead of the username.

For your information, PHP had a vulnerability a few years ago: one could forge an HTTP POST request with a file upload and a file name like "../../anything.php", and PHP's $_FILES array, which is supposed to contain sanitized values, didn't detect these kinds of file names, so one could write files anywhere in the filesystem.

FWH
Could you explain why?
Treb
Because I can upload some PHP code, and if you save it on your server with a .php extension, it will execute when I access it. There are a lot of examples. It is very hard to think of all the ways to gain access to a server when file upload is allowed with arbitrary names, which is why it's much better NOT to allow arbitrary names in the first place.
FWH
Right, thanks!
Treb
A: 

I would go with option #3. A table mapping these files to users will prove useful down the road; it always does. If you use the mapping, the only advantage of adding the user name or ID to the file name is that it helps when you are trying to debug a problem.

I'd probably use a GUID instead of a random number but either would work. The important things in my opinion are

  1. Never use the username as any part of the stored file name
  2. Never use the original file name as any part of the stored file name
  3. Use a random number or GUID to ensure there are no duplicate file names
  4. Adding a user ID to the file name will help with manual debugging
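
A minimal sketch of how the four points above could look in C#. The class name `UploadNaming`, its method names, and the `uploadRoot` parameter are made up for illustration; only `Guid.NewGuid` and `Path.Combine` are framework calls. The stored name is built from the numeric user ID and a fresh GUID, has no extension, and never touches anything the client sent.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper, not part of ASP.NET.
public static class UploadNaming
{
    // Returns "<userId>-<32 hex digits>", e.g. "23212-dd503cf8a5484584a0a339dc8be618df".
    // No client-supplied text and no extension end up in the name.
    public static string NewStoredFileName(int userId)
    {
        string id = userId.ToString(CultureInfo.InvariantCulture);
        return id + "-" + Guid.NewGuid().ToString("N");
    }

    // Combines an upload folder (assumed to live outside the web root) with a new name.
    public static string NewStoredPath(string uploadRoot, int userId)
    {
        return Path.Combine(uploadRoot, NewStoredFileName(userId));
    }
}
```

The original file name, if you want to keep it for display, would go into the mapping table rather than onto disk.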
Cody C
+1  A: 

There is more to this than meets the eye, which I suspect you already knew!

What sort of files are you talking about? If they are anything even remotely big or in such quantity that the group of files could be big I would immediately suggest that you add some flexibility to your approach.

  1. Create a table that stores the root paths to various file stores (these could be drives, UNC paths, whatever your environment supports). It will initially have one entry, which will be your first storage location. A nice attribute to maintain with this data is how much can be stored at each location.
  2. Maintain a table of file-related data (id {guid}, create date, foreign key to path data, file size).
  3. Write the file to a root that still has room on it (query all file sizes stored in a root location and compare the total to that root's capacity).
  4. Write the file using a GUID for the name (this obfuscates the file on the file system). It can be written without the file extension if security requires it (sensitive files).
  5. Write the file according to its create date, starting from the root: root/year{number}/month{number}/day{number}/file.extension
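
Steps 4 and 5 could be sketched roughly like this. `DatedFileStore` and `BuildPath` are hypothetical names, the zero-padding of month and day is my own choice, and the extension is left off as step 4 allows. Choosing a root with free capacity (step 3) and recording the row in the file table (step 2) are not shown.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper illustrating the root/year/month/day/guid layout.
public static class DatedFileStore
{
    // Builds <root>/<yyyy>/<MM>/<dd>/<guid without dashes> for a file created on 'created'.
    public static string BuildPath(string root, DateTime created, Guid fileId)
    {
        CultureInfo inv = CultureInfo.InvariantCulture;
        string year = created.Year.ToString("0000", inv);
        string month = created.Month.ToString("00", inv);
        string day = created.Day.ToString("00", inv);

        // Nested two-argument Path.Combine calls keep this usable on .NET 2.0.
        string dir = Path.Combine(Path.Combine(Path.Combine(root, year), month), day);
        return Path.Combine(dir, fileId.ToString("N"));
    }
}
```

Before writing, the directory part would still need to exist, for example via `Directory.CreateDirectory(Path.GetDirectoryName(path))`, which does nothing if the directory is already there.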

With a system of this nature in place, even though you won't need it up front, you can more easily relocate the files. You can better manage the files. You can better manage collections of files. Etc. I have used this system before and found it to be quite flexible. Dealing with files that are stored on a file system but managed from a database can get a bit out of control once the file store becomes large and things need to be moved around. Also, at least in the case of Windows, storing zillions of files in one directory is usually not a good idea (which is the reason for breaking things up by their create date).

This complexity is only really needed when you have high volumes and large footprints.

Andrew Siemer
+1  A: 

I'd use a combination of

  • User ID
  • A random generated string (e.g. a GUID)

Example PDF file name: 23212-dd503cf8-a548-4584-a0a3-39dc8be618df.pdf

This way, the user can upload as many files as he/she wants, without file name conflict, and you are also able to point out which files belong to which users, just by looking at the file names.

I don't see the need to include any other information in the file name, since upload time/date and such can be retrieved from the file's attributes.

Also, you should store the files in a safe location, which external users, such as visitors of your website, cannot access. Instead, you deliver the file to them through a proxy web page (you read the file from the safe location, and pass the data on to the user). For this solution, a database is needed to keep track of files, their location, etc.

This also makes you able to control which users have access to which files through your code.


Update: Here's a description of how the solution with the proxy web page could be implemented.

  1. Create a Web Form with the name GetFile.aspx
  2. GetFile.aspx takes one query parameter named fileid, which is used to identify the file to get. E.g.:
    http://www.mypage.com/GetFile.aspx?fileid=100
  3. Use the fileid parameter to lookup the file location in the database, so that it can be read and sent to the user. In the Web Form you use Request.QueryString("fileid") to get the file ID and use it in a query that will look something like this (SQL):
    SELECT FileLocation FROM UserFiles WHERE FileID = 100
  4. Read the file using a System.IO.FileStream and output its contents through Response.BinaryWrite or Response.OutputStream (Response.Write is meant for text, not raw file bytes). Remember to set the appropriate content type using Response.ContentType first, so that the client browser handles the requested file correctly (see this post on forums.asp.net and the MSDN article which is also referred to in the post, which both discuss a method of determining the appropriate content type automatically).
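
Here is a rough sketch of the logic behind steps 2 to 4, kept independent of System.Web so it can be run on its own. Everything named here (`StoredFile`, `FileProxy`, `TryFind`, `CopyTo`) is hypothetical, and the dictionary is only a stand-in for the UserFiles table; in a real page the lookup would be a parameterized query. The ownership check is one example of the kind of access rule such a page makes possible.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical row of the UserFiles table.
public sealed class StoredFile
{
    public readonly int OwnerUserId;
    public readonly string Location;
    public readonly string ContentType;

    public StoredFile(int ownerUserId, string location, string contentType)
    {
        OwnerUserId = ownerUserId;
        Location = location;
        ContentType = contentType;
    }
}

public static class FileProxy
{
    // Parses the raw "fileid" query value, looks it up, and checks ownership.
    // Returns false for malformed ids, unknown ids, and files owned by someone else.
    public static bool TryFind(IDictionary<int, StoredFile> files, string fileIdParam,
                               int requestingUserId, out StoredFile file)
    {
        file = null;
        int fileId;
        if (!int.TryParse(fileIdParam, out fileId)) return false;

        StoredFile candidate;
        if (!files.TryGetValue(fileId, out candidate)) return false;
        if (candidate.OwnerUserId != requestingUserId) return false;

        file = candidate;
        return true;
    }

    // Streams the file from the protected location to the given output in 8 KB chunks.
    public static void CopyTo(StoredFile file, Stream output)
    {
        using (FileStream input = new FileStream(file.Location, FileMode.Open,
                                                 FileAccess.Read, FileShare.Read))
        {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
        }
    }
}
```

In GetFile.aspx this would be wired up by passing `Request.QueryString["fileid"]` to `TryFind`, setting `Response.ContentType` from the result, and then calling `CopyTo` with `Response.OutputStream`.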

If you choose this approach, it's easy to implement your own simple security or custom actions later on, such as making sure a user is logged into your web site before you send the file, or that users can only access files they uploaded themselves, or logging which users download which files, etc. The possibilities are endless ;-)

Bernhof
No. Don't keep the extension, or I'll upload PHP files to your server, for example, and you certainly don't want me to do that.
FWH
If I use string guidResult = System.Guid.NewGuid().ToString(); to create a GUID in C#, is it certain that it will not produce duplicates?
Shyju
Shyju - as sure as you'll get :) .. The risk of two identical GUIDs being generated on the same computer (and even for the same user in this case) is so small that it's considered impossible. I wouldn't worry about it.
Bernhof
FWH - you're right, I forgot to mention that in this case, the files should be stored in a 'safe' location, meaning a location that is not accessible to external users. I've updated the answer. An alternative solution would be to just add a 'fake' extension, such as .file - in this case: 23212-dd503cf8-a548-4584-a0a3-39dc8be618df.pdf.file. This would prevent any file from executing.
Bernhof
Bernhof, could you elaborate on "deliver the file to them through a proxy web page"?
Shyju
Hi Shyju, I have updated the post to give you the general idea of how this is implemented. I would like to post code examples as well, but I currently don't have the time. Please let me know, if you would like me to describe it further. Good luck!
Bernhof
A: 

Take a look at the System.IO.Path class as it has lots of useful functions you can utilise, such as:

Check which characters are invalid in a file name (GetInvalidPathChars covers paths rather than file names):

System.IO.Path.GetInvalidFileNameChars();

Get a random file name:

System.IO.Path.GetRandomFileName();

Create a uniquely named, zero-byte temporary file in the temporary directory and get its full path:

System.IO.Path.GetTempFileName();
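
As a small illustration of the first helper, here is a sketch that uses `Path.GetInvalidFileNameChars` (the file-name counterpart of `GetInvalidPathChars`) to clean up a name. `PathHelpers` and `ReplaceInvalidFileNameChars` are made-up names. I would only apply this to something like a display name kept in the database, not to the name the file is stored under.

```csharp
using System;
using System.IO;

// Hypothetical helper built on System.IO.Path.
public static class PathHelpers
{
    // Replaces every character that is not allowed in a file name on the
    // current platform with an underscore.
    public static string ReplaceInvalidFileNameChars(string name)
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        char[] chars = name.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            if (Array.IndexOf(invalid, chars[i]) >= 0)
            {
                chars[i] = '_';
            }
        }
        return new string(chars);
    }
}
```

Note that the set of invalid characters depends on the operating system, so the result can differ between platforms.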
Dan Diplo