So, you're writing a web application and you have several areas of the site where the user can upload files. My basic working method for this is to store the actual file on the server, and have a database table that connects the stored filename to the record it relates to.

My question is this: Should there be a different table for each "type" of file? Also, should the files be stored in context-related locations on the server, or all together?

Some examples: user profile photos, job application CVs, related documents on CMS pages, etc.

+2  A: 

Do you intend to treat them differently or are they all just files for your application?

Lasse V. Karlsen

Do you intend to treat them differently or are they all just files for your application?

Well, that is the question - what level of abstraction makes sense? At the lowest level, files are just files right?

To extend on my example earlier - the Users table would have a photo column, that could either have the path to the file on disk - or it could just be a file_id from a central Files table. Should the actual way they are treated at the higher level of the application affect the lower level storage methods - does it make it more extensible?

+1  A: 

From what you've said I would just store files with random (UUID or what-not) filenames in one place. I would then have a 'attachments' table or something that contains references to all your external files. This table would also contain the meta-data for that file, so what type of file it is (picture, CV etc) and so on.

There may be hard limits to the number of files in one directory though, depending on what FS you are using.

+1  A: 

There might be various reasons for storing different files in different locations.

Firstly, a restriction on the number of files in one directory might be a consideration.

Secondly security might be an issue - if some are to be publicly viewable (such as profile photos for example) but others are not (such as CVs) then placing them in different directories would be easier to manage.

Thirdly, simple admin tasks may be easier if files are split, browsing in a file explorer for example, or managing backups, or modifying the application to split file storage across multiple locations.

There is also the issue of filename conflicts, but if you rename everything to match the database id field (for example) then this wouldn't be an issue.

But at the end of the day it probably depends on volumes and your own preference.

+1  A: 

A different table for each file type only becomes relevant if you store other metadata (and therefore, additional columns) for each type of file. If your tables for each file type only contain the same columns (e.g., filename, filetype, dateuploaded, etc) then it would make sense to have them all on one table.

Jon Limjap
+1  A: 

From your example, there is an argument for two tables, as you have files that can be associated with two different things.

  • CVs, photos are associated with a user.
  • attachments are associated with a CMS page.

If you put these in one table, (and you want to allow users to have more than one photo or cv) then you need two link-tables to associate files->users and files->cms_pages. Arguably this implies a HABTM relationship, which is not correct and allows for inconsistent data.

The two table approach is slightly cleaner and only allows files to be associated with the correct type of entity with a simple belongsTo relationship.

But I don't think there is any "right" answer to this qeustion, unless you need to store different types of metadata for different filetypes.

Also be sure to store, or be able to calculate, the mimetype for each file so it can be served correctly back to the browser, with the correct HTTP headers.