views:

144

answers:

5

I have a LAMP server with 256MB RAM (poor man's server in cloud). I have an app written to run on this machine. Currently people upload images and they go straight into mysql as BLOB.

There are concerns that this might be very memory consuming operation and we move over it to simple plain files. Can some one tell me if these concerns are valid? (Worth putting efforts into changing a lot of ode that's already written given that we will have sufficient RAM in next 6 months ?)

As a general rule when should we store images in DB and when as files?

+7  A: 

To read a BLOB in MySQL you need three times as much memory as it takes (it gets copied into several buffers).

So yes, reading a BLOB in MySQL consumes more memory than reading a file.

Quassnoi
Why the downvote?
Quassnoi
I got one too without any comment.. Someone seems to be doing strategical down-voting.
Andrew Moore
You said "three time". Not that I am doubting you but I will be very happy if you can substantiate that. :)
Akshar Prabhu Desai
MySQL's query buffer, while the data gets sent over the wire from mysql->script, PHP's own memory while the data's being echoed out and possibly copied around internally from database object->userland, and the webserver's own output buffer
Marc B
@Marc: no, it's three buffers only within `MySQL` :)
Quassnoi
@Akshar: see the link in my post.
Quassnoi
A: 

If you have a lot of files to keep track of, or they are very large, I'd store them as files. Especially if these files are to be accessed via the web, in which case you can offload all that effort from the SQL server and let the web server handle the transfer.

A good way to track images is to name them using the primary key, and then keep track of the original file name (if you need it) in the database. This way you can always know which file connects to which row. Also, if you have many files (thousands, millions,...), you might consider 'hashing' them into directories, so that 1-1000 are stored in /1, 1001-2000 ares stored in /2, etc. Some OS's see a bit of slowdown when you get a large number of files in a single directory.

GrandmasterB
+1  A: 

If it is for serving on a WebPage I would user plain file system with a link to the image file name in text format on DataBase. Apache and browsers usually do an amazing job on caching static files.

Even though in theory, you could achieve similar performance serving images from a database, the amount of work you need to do for it does not justify this selection given that the only advantage I can think of is a more cohesive database (with a simple DB dump you get ALL your data: images + data).

Pablo Santa Cruz
Caching is seriously one important thing that dint come to my mind. But does apache ever knows if the file from from filesystem or db? http://example.org/file.jpg might well come from db.
Akshar Prabhu Desai
@Akshar: of course it knows. `example.org/file.jpg` would first go though `mod_rewrite` which would map it to `example.org/serve-db-file.php` or something like this. This all is within `Apache` of course.
Quassnoi
+6  A: 

You should store them in the file system for several reasons:

  • The images are easily accessible via other apps (shell, FTP, www, etc...),
  • It's less resource intensive (including memory) to read them from the file system than from a database
  • If the database gets corrupted, the images are safe.
  • You also won't have tables bump up against their size limitations (determined by OS file size limitations) which slows them down (and making them require more resources to read).

The only time you should consider storing images in the DB is when they are used in transaction processing, and even then, there are numerous workarounds to that when storing in the file system.

webbiedave
+1 for #3, that's an oft-overlooked bigass benefit
Andrew Heath
+4  A: 

To summarize:

Database Storage:

Pros:

  • Assures referential integrity
  • Easier backup strategy
  • Easier clustering (database cluster)

Cons:

  • Higher cost in memory usage and storage
  • Hard to scale
  • Additional code must be written to support HTTP caching
  • Requires a database and associated querying code

File System Storage

Pros:

  • Low memory footprint (more efficient)
  • Storage equal to file size
  • Easy retrieval and storage
  • Allows the web server to control caching

Cons:

  • Referential integrity not assured
  • Backups are not always in sync with database backups
  • Requires additional backup strategy

If referential integrity of your images is important, store them in the database. The advantages is that a backup of your database will always means that your rows and images are in sync. It does mean though that it is a bit more costly resource wise to store and retrieve.

If the images themselves are not that important, store them as files. It allows for fast and simple retrieval and storage. The downside of using files though is that your backup strategy becomes more complicated and your files will not always be in sync with your database rows.

I personally always store them in the database. For me, the rewards are greater then the cost. This is hardly always the case though and you should look at your application requirements to see which is best for you.

Some large websites are using BLOBs to store their website content. Flickr's use of BLOBs is actually well documented. To answer your question though, file storage is more memory efficient than database storage.

Andrew Moore
Always nice to know why someone disagrees. My post is pretty neutral and states the advantages and disadvantages of both... So why a downvote?
Andrew Moore
I did not downvote. However, the OP is most likely using MyISAM so referential integrity is not ensured in that storage type.
webbiedave