I'm working on a Rails app that accepts file uploads and where users can modify these files later. For example, they can change the text file contents or perform basic manipulations on images such as resizing, cropping, rotating etc.

At the moment the files are stored on the same server where Apache is running with Passenger to serve all application requests.

I need to move user files to a dedicated server to distribute the load on my setup. At the moment our users upload around 10GB of files a week, which is not a huge amount, but it adds up over time.

So I'm going through different options for implementing the communication between the application server(s) and a file server. I'd like to start out with a simple and foolproof solution. If it later scales well across multiple file servers, I'd be more than happy.

Here are some of the options I've been investigating:

  • Amazon S3. I find it a bit difficult to implement for my application. It adds the complexity of "uploading" the uploaded file again (possibly multiple times later); please keep in mind that users can modify files and images with my app. Other than that, it would be a nice "set it and forget it" solution.
  • Some sort of simple RPC server that lives on the file server and manages files transparently from the application server's point of view. I haven't been able to find any standard, well-tested tools for this yet, so it is still a bit theoretical in my mind. BERT and Ernie, built and used at GitHub, seem interesting but may be too complex to start out with.
  • MogileFS also seems interesting. I haven't seen it in use (but that's my problem :).
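Whichever option wins, one way to keep the choice open is to hide file access behind a small storage interface, so the app only deals in logical keys and a later move to S3, an RPC file server, or MogileFS means swapping one class. The following is a minimal sketch using only the Ruby standard library; the class name `LocalDiskStore` and its method names are hypothetical, not part of any of the tools listed above.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical storage interface: the application only calls
# write/read/delete/exist? with a logical key, never a filesystem path.
class LocalDiskStore
  def initialize(root)
    @root = File.expand_path(root)
  end

  # Store (or overwrite) the bytes for a key, e.g. after a user edits a file.
  def write(key, data)
    path = path_for(key)
    FileUtils.mkdir_p(File.dirname(path))
    File.binwrite(path, data)
    key
  end

  def read(key)
    File.binread(path_for(key))
  end

  def delete(key)
    FileUtils.rm_f(path_for(key))
  end

  def exist?(key)
    File.file?(path_for(key))
  end

  private

  # Map a key to a path under the root and reject keys that escape it.
  def path_for(key)
    path = File.expand_path(key, @root)
    unless path.start_with?(@root + File::SEPARATOR)
      raise ArgumentError, "key escapes storage root: #{key.inspect}"
    end
    path
  end
end

store = LocalDiskStore.new(Dir.mktmpdir)
store.write("users/42/notes.txt", "first draft")
store.write("users/42/notes.txt", "edited draft") # a later user edit overwrites in place
puts store.read("users/42/notes.txt")
```

A backend for a remote file server would expose the same four methods, so the code that resizes images or edits text files would not need to change.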

So I'm looking for different (and possibly standards-based) approaches to implementing file servers for web applications, and for reports on how they have worked in the wild.

A: 

I think S3 is your best bet. With a plugin like Paperclip it's really very easy to add to a Rails application, and not having to worry about scaling it will save on headaches.

Michael Melanson
+1  A: 

You could also try compiling a version of Dropbox (they provide the source) and symlinking it (ln -s) to your public/system directory so Paperclip saves to it. That way you can access the files remotely from any desktop as well. I haven't done this yet, so I can't attest to how easy, hard, or valuable it is, but it's on my teux deux list... :)

BandsOnABudget
+1  A: 

Use S3. It is inexpensive, a-la-carte, and if people start downloading their files, your server won't have to get stressed because your download pages can point directly to the S3 URL of the uploaded file.
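To illustrate pointing a download page straight at S3, here is a rough sketch of building the URL for an uploaded file. It assumes the object is publicly readable and uses the virtual-hosted-style address (bucket name as subdomain); the helper name `s3_public_url` and the example key are made up for illustration, and private files would need signed URLs instead.

```ruby
require "erb"

# Build a virtual-hosted-style URL for an object in a bucket.
# Assumption: the object is publicly readable.
def s3_public_url(bucket, key)
  # Encode each path segment separately so the "/" separators survive.
  encoded_key = key.split("/").map { |segment| ERB::Util.url_encode(segment) }.join("/")
  "https://#{bucket}.s3.amazonaws.com/#{encoded_key}"
end

puts s3_public_url("your-paperclip-demo", "users/42/holiday photo.jpg")
```

A link built this way is served by S3 itself, so the download never passes through the Apache/Passenger box.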

"Pedro" has a nice sample application that works with S3 at github.com.

  1. Clone the application ( git clone git://github.com/pedro/paperclip-on-heroku.git )
  2. Make sure that you have the right_aws gem installed.
  3. Put your Amazon S3 credentials (API key and secret) into config/s3.yml
  4. Install the Firefox S3 plugin (http://www.s3fox.net/)
  5. Go into the Firefox S3 plugin and enter your API key and secret.
  6. Use the S3 plugin to create a bucket with a unique name, perhaps 'your-paperclip-demo'.
  7. Edit app/models/user.rb, and put your bucket name on the second-to-last line (:bucket => 'your-paperclip-demo').
  8. Fire up your server locally and upload some files to your local app. You'll see from the S3 plugin that the file was uploaded to Amazon S3 in your new bucket.
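For step 3, a small sanity check on the credentials file can save some head-scratching before the first upload. The sketch below assumes config/s3.yml uses the key names `access_key_id` and `secret_access_key`, which is the usual Paperclip convention but is not confirmed for this particular sample app; the helper `load_s3_credentials` and the placeholder values are hypothetical.

```ruby
require "yaml"

# Placeholder values. The key names are an assumption based on the
# common Paperclip convention for config/s3.yml.
EXAMPLE_S3_YML = <<~YAML
  access_key_id: YOUR_ACCESS_KEY_ID
  secret_access_key: YOUR_SECRET_ACCESS_KEY
YAML

REQUIRED_KEYS = %w[access_key_id secret_access_key].freeze

# Parse the YAML text and return the credentials hash,
# raising if the file is malformed or a required key is missing or blank.
def load_s3_credentials(yaml_text)
  config = YAML.safe_load(yaml_text)
  raise ArgumentError, "expected a mapping of credential keys" unless config.is_a?(Hash)

  missing = REQUIRED_KEYS.select { |key| config[key].to_s.strip.empty? }
  raise ArgumentError, "missing S3 credentials: #{missing.join(', ')}" unless missing.empty?

  config
end

credentials = load_s3_credentials(EXAMPLE_S3_YML)
puts credentials.keys.join(", ")
```

In the app itself you would pass in `File.read("config/s3.yml")` rather than the inline example string.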

I'm usually terribly incompetent or unlucky at getting these kinds of things working, but with Pedro's little S3 upload application I was successful. Good luck.

Jay Godse