I have seen quite a few code samples and plugins that promote uploading assets directly to S3. For example, if you have a user object with an avatar, the file upload field would upload directly to S3.

The only way I see this being possible is if the user object is already created in the database and your S3 bucket + path is something like

user_avatars.domain.com/some/id/partition/medium.jpg

But then if you had an image tag that tried to access that URL when an avatar was not uploaded, it would yield a bad result. How would you handle checking for existence?

Also, it seems like this would not work well for most has_many associations. For example, if a user had many songs/MP3s, where would you store them and how would you access them?

Also, your validations will be shot, since the file never passes through your server.

I am having trouble thinking of situations where direct upload to S3 (or any cloud) is a good idea and was hoping people could clarify either proper use cases, or tell me why my logic is incorrect.

+2  A: 

Why pay for storage/bandwidth/backups/etc. when you can have somebody in the cloud handle it for you?

S3 (and other cloud-based storage options) handle all the headaches for you. You get all the storage you need, a good distribution network (almost certainly better than you'd have on your own unless you're paying for a premium CDN), and backups.

Allowing users to upload directly to S3 takes even more of the bandwidth load off of you. I can see the tracking concerns, but S3 makes it pretty easy to handle that situation. If you look at the direct upload methods, you'll see that you can force a redirect on a successful upload.
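As a rough sketch of what "force a redirect" can look like on the server side, the snippet below builds the hidden form fields for a browser-based POST upload, including `success_action_redirect`. It assumes Signature Version 4 signing, which is what current S3 regions expect, not necessarily the scheme in use when this answer was written. The method name `s3_post_fields` and every bucket name, prefix, URL, and credential are made up for illustration.

```ruby
require "openssl"
require "base64"
require "json"
require "time"

# Build the hidden fields for an HTML form that POSTs a file straight to S3.
# Hypothetical helper: all argument values are placeholders chosen by the caller.
def s3_post_fields(bucket:, key_prefix:, redirect_url:, access_key:, secret_key:,
                   region: "us-east-1", now: Time.now, max_bytes: 10 * 1024 * 1024)
  now        = now.getutc
  date       = now.strftime("%Y%m%d")
  amz_date   = now.strftime("%Y%m%dT%H%M%SZ")
  credential = "#{access_key}/#{date}/#{region}/s3/aws4_request"

  # The policy lists what the browser is allowed to send. S3 rejects any
  # upload that does not satisfy every condition.
  policy = {
    "expiration" => (now + 3600).iso8601,
    "conditions" => [
      { "bucket" => bucket },
      ["starts-with", "$key", key_prefix],
      { "success_action_redirect" => redirect_url },
      ["content-length-range", 1, max_bytes],
      { "x-amz-algorithm" => "AWS4-HMAC-SHA256" },
      { "x-amz-credential" => credential },
      { "x-amz-date" => amz_date }
    ]
  }
  policy_b64 = Base64.strict_encode64(JSON.generate(policy))

  # Derive the SigV4 signing key: an HMAC-SHA256 chain over date, region,
  # service, and the fixed terminator string.
  signing_key = [date, region, "s3", "aws4_request"].reduce("AWS4#{secret_key}") do |key, part|
    OpenSSL::HMAC.digest("SHA256", key, part)
  end

  {
    "key" => key_prefix + '${filename}', # S3 substitutes the uploaded file's name
    "success_action_redirect" => redirect_url,
    "policy" => policy_b64,
    "x-amz-algorithm" => "AWS4-HMAC-SHA256",
    "x-amz-credential" => credential,
    "x-amz-date" => amz_date,
    "x-amz-signature" => OpenSSL::HMAC.hexdigest("SHA256", signing_key, policy_b64)
  }
end
```

Each pair would be rendered as a hidden input ahead of the file input in a form whose action is the bucket URL. The secret key never leaves the server. The `starts-with` and `content-length-range` conditions give you some validation even though the bytes never touch your app.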

Amazon will then pass the following to the redirect handler as query string parameters: bucket, key, etag

That should give you what you need to track the uploaded asset after success. Direct uploads give you the best of both worlds. You get your tracking information and it unloads your bandwidth.
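To make the tracking step concrete, here is a minimal sketch of what the redirect handler might do with those three parameters. `UploadRegistry` is a hypothetical in-memory stand-in for a database table, and the default avatar path and bucket URL style are assumptions. It also touches on two points from the question: an owner can have many recorded assets, and an owner with no upload gets a fallback image instead of a broken URL.

```ruby
require "uri"

DEFAULT_AVATAR_URL = "/images/default_avatar.png" # hypothetical placeholder path

# In-memory stand-in for a database table: owner id => list of asset records.
class UploadRegistry
  def initialize
    @assets = Hash.new { |hash, owner_id| hash[owner_id] = [] }
  end

  # Record an upload from the URL that S3 redirected the browser to.
  # Returns nil (and stores nothing) when the key is missing or falls
  # outside the prefix this owner is allowed to write to.
  def record_redirect(owner_id, redirect_url, expected_prefix:)
    params = URI.decode_www_form(URI.parse(redirect_url).query.to_s).to_h
    key = params["key"]
    return nil unless key && key.start_with?(expected_prefix)

    # S3 sends the etag wrapped in double quotes, so strip them.
    asset = { bucket: params["bucket"], key: key, etag: params["etag"].to_s.delete('"') }
    @assets[owner_id] << asset
    asset
  end

  # All assets for an owner, e.g. a user's songs. Empty if none were recorded.
  def assets_for(owner_id)
    @assets.fetch(owner_id, [])
  end

  # Most recent asset as a URL, or the default when nothing was uploaded.
  # Assumes keys that need no extra URL escaping.
  def avatar_url(owner_id)
    asset = assets_for(owner_id).last
    asset ? "https://#{asset[:bucket]}.s3.amazonaws.com/#{asset[:key]}" : DEFAULT_AVATAR_URL
  end
end
```

Because the app only renders a URL for keys it has recorded, there is no need to ask S3 whether an object exists before emitting an image tag.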

Check this link for details: Amazon S3: Browser-Based Uploads using POST

Justin Niessner
This doesn't really answer the question though. I totally understand why I would want to store stuff on S3. I do it right now. What I don't understand is how you would upload directly to S3 without using your server as a conduit for validation, and tracking the asset you uploaded in a database.
Tony
That's what I was looking for. I guess the only thing you lose here is any post-processing of images unless you download, post-process, and then upload the altered images.
Tony
@Tony - Glad to help. As for post-processing, that might be a use case where you want to proxy the upload. Having the user upload directly to S3, only for you to download and re-upload the file, seems like a bit of a waste.
Justin Niessner
@Justin I agree, thanks again for the link. I should have looked at the docs more carefully, but surprisingly none of the 50 file upload articles I have read mentions anything about it.
Tony
+1  A: 

If you are hosting your Rails application on Heroku, the reason could very well be that Heroku doesn't allow file-uploads larger than 4MB:
http://docs.heroku.com/s3#direct-upload

So if you would like your users to be able to upload large files, this is the only way forward.

Thomas Watson
Yeah, I recently heard about this. So how would you process an image larger than 4MB with an app on Heroku? It seems like a pretty big limitation, and I'm guessing there's a solution.
Tony
You first upload directly to S3. Then, via AJAX, you tell your app about the newly uploaded file. Your app can then access the file on S3 to do post-processing (e.g. thumbnailing) and register it in the database (e.g. in Paperclip or Attachment_fu). Code examples: http://gist.github.com/575842 or http://tnux.net/2010/01/17/swfupload-direct-to-amazon-s3-in-ruby-on-rails/
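One way to get validation back in this flow is to check the object before registering it. The sketch below assumes the app has already made a HEAD request for the new object and has the response headers as a hash. `upload_errors`, the size limit, and the allowed types are all made up for illustration. The Content-Type stored on S3 is whatever the uploader declared, so treat it as a first filter rather than proof of the file's contents.

```ruby
# Hypothetical whitelist; adjust to whatever the model accepts.
ALLOWED_TYPES = %w[image/jpeg image/png image/gif].freeze

# Return a list of validation error messages for an uploaded object,
# based on the headers of a HEAD response for it. Empty list means valid.
def upload_errors(headers, max_bytes: 20 * 1024 * 1024, allowed_types: ALLOWED_TYPES)
  # Header names are case-insensitive, so normalize them first.
  normalized = headers.each_with_object({}) do |(name, value), result|
    result[name.to_s.downcase] = value.to_s
  end

  errors = []

  size = normalized["content-length"].to_s
  if size.match?(/\A\d+\z/)
    errors << "file is too large" if size.to_i > max_bytes
  else
    errors << "missing or invalid size"
  end

  # Drop any parameters such as "; charset=..." before comparing.
  type = normalized["content-type"].to_s.split(";").first.to_s.strip.downcase
  errors << "unsupported content type" unless allowed_types.include?(type)

  errors
end
```

If the list comes back non-empty, the app can delete the object and report the errors over the same AJAX call, much as a failed model validation would.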
Thomas Watson