views:

3221

answers:

5

I'm going to be starting a project soon that requires support for large-ish binary files. I'd like to use Ruby on Rails for the webapp, but I'm concerned with the BLOB support. In my experience with other languages, frameworks, and databases, BLOBs are often overlooked and thus have poor, difficult, and/or buggy functionality.

Does RoR spport BLOBs adequately? Are there any gotchas that creep up once you're already committed to Rails?

BTW: I want to be using PostgreSQL and/or MySQL as the backend database. Obviously, BLOB support in the underlying database is important. For the moment, I want to avoid focusing on the DB's BLOB capabilities; I'm more interested in how Rails itself reacts. Ideally, Rails should be hiding the details of the database from me, and so I should be able to switch from one to the other. If this is not the case (ie: there's some problem with using Rails with a particular DB) then please do mention it.

UPDATE: Also, I'm not just talking about ActiveRecord here. I'll need to handle binary files on the HTTP side (file upload effectively). That means getting access to the appropriate HTTP headers and streams via Rails. I've updated the question title and description to reflect this.

+2  A: 

I think your best bet is the attachment_fu plug-in: http://github.com/technoweenie/attachment_fu/tree/master

UPDATE: Found some more info here http://groups.google.com/group/rubyonrails-talk/browse_thread/thread/a81beffb93708bb3

Teflon Ted
+2  A: 

You can use the :binary type in your ActiveRecord migration and also constrain the maximum size:

class BlobTest < ActiveRecord::Migration
  def self.up
    create_table :files do |t|
      t.column :file_data, :binary, :limit => 1.megabyte
    end
  end
end

ActiveRecord exposes the BLOB (or CLOB) contents as a Ruby String.

John Topley
+4  A: 

+1 for attachment_fu

I use attachment_fu in one of my apps and MUST store files in the DB (for annoying reasons which are outside the scope of this convo).

The (one?) tricky thing dealing w/BLOB's I've found is that you need a separate code path to send the data to the user -- you can't simply in-line a path on the filesystem like you would if it was a plain-Jane file.

e.g. if you're storing avatar information, you can't simply do:

<%= image_tag @youruser.avatar.path %>

you have to write some wrapper logic and use send_data, e.g. (below is JUST an example w/attachment_fu, in practice you'd need to DRY this up)

send_data(@youruser.avatar.current_data, :type => @youruser.avatar.content_type, :filename => @youruser.avatar.filename, :disposition => 'inline' )

Unfortunately, as far as I know attachment_fu (I don't have the latest version) does not do clever wrapping for you -- you've gotta write it yourself.

P.S. Seeing your question edit - Attachment_fu handles all that annoying stuff that you mention -- about needing to know file paths and all that crap -- EXCEPT the one little issue when storing in the DB. Give it a try; it's the standard for rails apps. IF you insist on re-inventing the wheel, the source code for attachment_fu should document most of the gotchas, too!

Matt Rogish
A: 

Look into the plugin, x_send_file too.

"The XSendFile plugin provides a simple interface for sending files via the X-Sendfile HTTP header. This enables your web server to serve the file directly from disk, instead of streaming it through your Rails process. This is faster and saves a lot of memory if you‘re using Mongrel. Not every web server supports this header. YMMV."

I'm not sure if it's usable with Blobs, it may just be for files on the file system. But you probably need something that doesn't tie up the web server streaming large chunks of data.

+3  A: 

As for streaming, you can do it all in an (at least memory-) efficient way. On the upload side, file parameters in forms are abstracted as IO objects that you can read from; on the download side, look in to the form of render :text => that takes a Proc argument:

render :content_type => 'application/octet-stream', :text => Proc.new {
    |response, output|
    # do something that reads data and writes it to output
}

If your stuff is in files on disk, though, the aforementioned solutions will certainly work better.

Jim Puls