Hi,

I'm building file upload for a web application (running on Unix/Linux), and I'm wondering whether there's any concern with creating a new directory for each file uploaded. This is the out-of-the-box approach of the Ruby on Rails plugin "paperclip". I'm debating what the trade-offs are, or whether it's simply not a concern when deploying on a Linux/Unix environment.

The options would seem to be:

  1. One folder per file attachment - how paperclip seems to work out of the box
  2. One folder per user (i.e. if the web service has multiple users, each with their own account) - and then add some uniqueness to the filename, perhaps the model ID (see the sketch after this list)
  3. Put all attachments in one folder - but this is probably going too far the other way
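
For option 2, if you're already using paperclip, here is a hedged sketch of what that layout could look like. The Upload model, user_id column, and path are illustrative assumptions; :user_id is a custom interpolation registered via Paperclip.interpolates, not a paperclip built-in:

    # Sketch only: one folder per user, with the record ID prepended
    # to the filename for uniqueness. Assumes an Upload model with a
    # user_id column; :user_id is a custom interpolation, not built in.
    Paperclip.interpolates :user_id do |attachment, style|
      attachment.instance.user_id
    end

    class Upload < ActiveRecord::Base
      belongs_to :user
      has_attached_file :document,
        :path => ":rails_root/public/system/:user_id/:id-:filename"
    end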

Question - should I be concerned about the number of directories being created? Is this an issue for the OS if the service becomes popular? For a website that lets users with their own accounts upload files, what structure would be good for storing them? (I've discounted the idea of storing the files in MySQL.)

Thanks

A: 

If you have a separate partition for the directory where the new files/directories get created, I'd say it's not a problem. It can become a problem if you don't use a separate partition, since you can run out of inodes and/or free disk space, which can be bad for the whole system.

Using a separate partition means that (in the case of a DoS attack) only your application stops working correctly; the system itself won't get hurt in any way.

Johannes Weiß
A: 

Not as such, but having gazillions of folders in one directory (or likewise for files) isn't recommended - it's a real hit to speed.

Reason: C-style strings

A good solution would be to store things hierarchically, something like: /path/to/usernamefirstletter/username/year/month/file
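
A minimal Ruby sketch of that layout (the function name, base path, and sample values are just illustrative):

    require 'fileutils'

    # Builds /base/u/username/year/month/filename per the scheme above.
    def storage_path(base, username, filename, time = Time.now)
      File.join(base,
                username[0, 1],       # first letter of the username
                username,
                time.strftime('%Y'),  # year
                time.strftime('%m'),  # month
                filename)
    end

    path = storage_path('/var/www/uploads', 'alice', 'report.pdf')
    FileUtils.mkdir_p(File.dirname(path))  # create the intermediate dirs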

Aviral Dasgupta
Also: never expose these "real" file paths to users... use server-side redirection for friendlier URLs.
Aviral Dasgupta
Actually, the reason people have traditionally avoided very large directories is that the directory used to be a flat list with no indexing; it has nothing to do with the strings being NUL-terminated. And it's increasingly a non-issue on modern filesystems, almost all of which do some form of directory indexing. All modern Linux distributions I'm aware of use indexed directories by default, as does Solaris/ZFS. Not sure about the BSDs.
Andy Ross
+2  A: 

Assuming an ext3-formatted drive under Linux (the most common).

From http://en.wikipedia.org/wiki/Ext3:

"There is a limit of 31998 sub-directories per one directory, stemming from its limit of 32000 links per inode.[13]"

So if you create one directory per upload inside a single parent directory, you'll hit that limit at around 32,000 uploads - which isn't that high - and your application will fail.
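
A common workaround (a sketch, not paperclip-specific; the helper name and paths are illustrative) is to shard uploads across intermediate directories derived from a digest of the record ID, so no single directory comes anywhere near that limit:

    require 'digest'
    require 'fileutils'

    # Two 2-hex-character levels give 256 * 256 = 65536 buckets,
    # keeping every directory far below ext3's 31998 sub-directory cap.
    def sharded_dir(base, record_id)
      digest = Digest::SHA1.hexdigest(record_id.to_s)
      File.join(base, digest[0, 2], digest[2, 2], record_id.to_s)
    end

    dir = sharded_dir('/var/www/uploads', 12345)  # e.g. /var/www/uploads/8c/b2/12345
    FileUtils.mkdir_p(dir)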

Donblas