I am currently using the Zend Framework and have a file upload form. An authenticated user can upload a file, which is stored in a directory inside the application, and its location is saved in the database. That way it can be displayed as a link to a downloadable file.

<a href="/upload-location/filename.pdf">Download</a>

However, I have noticed that uploading a file with the same name as an existing one overwrites the file in the uploads directory. There is no error message, and the filename is not incremented, so I assume the existing file is overwritten (or the new one is never stored).

What are some best practices I should be aware of when uploading, moving, or storing these files? Should I always be renaming the files so that the filename is always unique?

+7  A: 

Generally, we don't store files under the name given by the user, but under a name that we (i.e. our application) choose.

For instance, if a user uploads my_file.pdf, we would :

  • store a line in the DB, containing :
    • id ; an autoincrement, the primary key -- "123", for instance
    • the name given by the user ; so we can send the right name when someone tries to download the file
    • the content-type of the file ; application/pdf or something like that, for instance.
    • "our" name : file-123 for instance
  • when there is a request to the file with id=123, we know which physical file should be fetched ('file-' . $id) and sent.
  • and we can set a header to send the correct "logical" name to the browser, using the name we stored in the DB, for the "save as" dialog box
  • same for the content-type, btw
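
The storing side of these steps might look roughly like the sketch below. The function names (storedName, buildFileRecord, storeUpload) and the array keys are made up for illustration, and $id is assumed to be the autoincrement key you get back after inserting the row.

```php
<?php
// Sketch only: function names and record keys are assumptions, not part of the answer.

/** Build the application-chosen name for a stored file, e.g. "file-123". */
function storedName(int $id): string
{
    return 'file-' . $id;
}

/**
 * Build the row that describes an upload.
 * $id stands in for the autoincrement primary key returned by the DB insert.
 */
function buildFileRecord(int $id, string $clientName, string $contentType): array
{
    return [
        'id'           => $id,
        // basename() drops any directory part the client may have sent
        'client_name'  => basename($clientName),
        'content_type' => $contentType,
        'stored_name'  => storedName($id),
    ];
}

/** Move the uploaded temp file into the upload directory under the stored name. */
function storeUpload(string $tmpPath, string $uploadDir, array $record): string
{
    $target = rtrim($uploadDir, '/') . '/' . $record['stored_name'];
    if (!move_uploaded_file($tmpPath, $target)) {
        throw new RuntimeException('Could not store upload');
    }
    return $target;
}
```

Since the name on disk never depends on what the client sent, two uploads of my_file.pdf simply end up as two different rows and two different files.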

This way, we make sure :

  • that no file has any "wrong" name, as we are the ones choosing it, and not the client
  • that there is no overwriting : as our filenames include the primary key of our table, those file names are unique
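
For the download side, a minimal sketch could be something like this. The $record keys are assumed names for the DB columns listed above, and downloadHeaders / sendFile are hypothetical helpers, not Zend Framework API.

```php
<?php
// Sketch only: the $record keys are assumed names for the DB columns described above.

/** Build the response headers for sending a stored file under its original name. */
function downloadHeaders(array $record): array
{
    // Strip characters that would break the quoted filename parameter
    $safeName = str_replace(['"', '\\', "\r", "\n"], '', $record['client_name']);

    return [
        'Content-Type: ' . $record['content_type'],
        'Content-Disposition: attachment; filename="' . $safeName . '"',
    ];
}

/** Emit the headers, then stream the physical file ('file-' . $id) to the client. */
function sendFile(array $record, string $uploadDir): void
{
    foreach (downloadHeaders($record) as $line) {
        header($line);
    }
    readfile(rtrim($uploadDir, '/') . '/file-' . $record['id']);
}
```

The browser then offers the user's original name in the "save as" dialog, even though the file on disk is called file-123.
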
Pascal MARTIN
great answer, thanks!
Andrew
@Andrew : you're welcome :-) Have fun !
Pascal MARTIN
A: 

Yes, you need to come up with a way to name them uniquely. I've seen all kinds of different strategies for this, ranging from a hash based on the original filename, the PK of the DB record and the upload timestamp, to some type of slugging, again based on various fields in the DB record it's attached to or in related records.
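
As one possible take on the hash strategy, here is a small sketch. SHA-1, the '|' separator and the function name hashedName are my own assumptions; any stable hash over the same three fields would do.

```php
<?php
// Sketch only: the choice of SHA-1 and the field order are assumptions.

/** Derive a storage name from the original filename, the record's PK and the upload time. */
function hashedName(string $originalName, int $pk, int $uploadedAt): string
{
    // The PK alone already makes the input unique; the other fields just vary the hash further
    return sha1($originalName . '|' . $pk . '|' . $uploadedAt);
}
```

Because the PK is part of the input, two uploads of the same filename in the same second still hash different strings.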

prodigitalson
+1  A: 

Continuing on Pascal MARTIN's answer:

If you use an ID as the name, you can also come up with a directory naming strategy. It takes no longer to get /somedir/part1ofID/part2OfID from the filesystem than /somedir/theWholeID, but it lets you control how many files are stored in the same directory by how you split the ID to form the path and file name.
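
To make the split concrete, here is a rough sketch. The zero padding to six digits and the "last three digits are the file name" rule are assumptions picked for the example, and shardedPath is a hypothetical helper.

```php
<?php
// Sketch only: the 6-digit zero padding and the split point are assumptions.

/** Map a numeric ID to a nested path so that each directory holds at most 1000 files. */
function shardedPath(string $baseDir, int $id): string
{
    $padded   = str_pad((string) $id, 6, '0', STR_PAD_LEFT);
    $dirPart  = substr($padded, 0, -3); // everything but the last three digits
    $filePart = substr($padded, -3);    // the last three digits

    return rtrim($baseDir, '/') . '/' . $dirPart . '/' . $filePart;
}
```

With this split, ID 123 maps to /somedir/000/123 and ID 1234567 maps to /somedir/1234/567. Moving the split point changes how many files share one directory.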

The next good thing is that the script you use to actually output the file can check whether the user is authorized to see the file or not. This of course requires the files to be stored somewhere that is not readable by everyone by default.
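
A sketch of such a check, under the assumption that each file record has an owner_id column and that only the owner may download. Both the column and the rule are invented for the example; your own access rules go in their place.

```php
<?php
// Sketch only: the 'owner_id' column and the owner-only rule are assumptions.

/** Decide which HTTP status the download script should answer with. */
function downloadStatus(?int $userId, ?array $record): int
{
    if ($record === null) {
        return 404; // no such file record
    }
    if ($userId === null) {
        return 401; // nobody is logged in
    }
    // Only the owner gets the file; everyone else is refused
    return $userId === $record['owner_id'] ? 200 : 403;
}
```

The script would only go on to read the file from the protected directory when this returns 200.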

You may also want to look at this other question. Not totally related, but good to be aware of.

Per Wiklander
If you're going to "continue" or add to somebody's answer, use a comment on their answer, not post it as an answer. Mostly because if there are lots of answers, your post will not be next to the one you're adding to.
TravisO
Yeah, that was my idea at first as well. I just feel the comment box is a bit limited for adding longish answers. What do you suggest I do, remove my answer and add it as a comment instead, or add more context to my answer?
Per Wiklander