views:

142

answers:

3

I"m looking to run PostgreSQL in RAM for performance enhancement. The database isn't more than 1GB and shouldn't ever grow to more than 5GB. Is it worth doing? Are there any benchmarks out there? Is it buggy?

My second major concern is: how easy is it to back things up when the database is running purely in RAM? Is it just like treating RAM as a tier-1 hard drive, or is it much more complicated?

+1  A: 

Actually... as long as you have enough memory available, your database will already be running almost entirely from RAM. The filesystem will buffer all the data, so it won't make much of a difference.

But... there is of course always a bit of overhead, so you can still try running it all from a RAM drive.
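As a rough sketch only (paths and the tmpfs size are placeholders, and anything on tmpfs disappears on reboot unless you copy it back to disk), putting a whole cluster on a RAM-backed filesystem on Linux could look like:

    # create a RAM-backed filesystem and initialize a cluster on it
    mkdir /mnt/pgram
    mount -t tmpfs -o size=6G tmpfs /mnt/pgram
    initdb -D /mnt/pgram/data
    pg_ctl -D /mnt/pgram/data -l /mnt/pgram/logfile start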

As for the backups, that's just like any other database. You could use the normal Postgres dump utilities to back up the system. Or even better, let it replicate to another server as a backup.
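As a rough illustration (the database name and output path are placeholders), a periodic dump could be as simple as:

    # custom-format dump, restorable later with pg_restore
    pg_dump -Fc -f /backups/mydb.dump mydb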

WoLpH
A database on disk, even if all the pages are in memory, needs to synchronise writes to the log file in order to maintain ACID properties. So an in-RAM database would actually be quite a bit faster because it wouldn't have to do that. Of course, that doesn't apply if the database is mostly read-only.
Dean Harding
@codeka In my case I'll be doing almost as many writes (with loads of transactions) as I'll be doing reads.
orokusaki
@Dean Harding: with most database servers that is configurable. Writes don't _have_ to be synchronous. With Postgres, for example, you can configure exactly when to write (interval, bytes, etc.).
WoLpH
+4  A: 

Whether to hold your database in memory depends on its size and performance requirements, as well as how robust you need it to be with writes. I assume you are writing to your database and that you want to persist the data in case of failure.

Personally, I would not worry about this optimization until I ran into performance issues. It just seems risky to me.

If you are doing a lot of reads and very few writes, a cache might serve your purpose. Many ORMs come with one or more caching mechanisms.

From a performance point of view, clustering across a network to another DBMS that does all the disk writing seems far less efficient than a single, regular DBMS tuned to keep as much as possible in RAM.

Romain Hippeau
+3  A: 

It might be worth it if your database is I/O bound. If it's CPU-bound, a RAM drive will make no difference.

But first things first: make sure your database is properly tuned. You can get huge performance gains that way without losing any guarantees. Even a RAM-based database will perform badly if it's not properly tuned. See the PostgreSQL wiki on tuning, mainly shared_buffers, effective_cache_size, checkpoint_*, and default_statistics_target.
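As a rough sketch only (the numbers are made up for a machine with a few GB of RAM to spare and depend entirely on your hardware and workload), the relevant postgresql.conf entries look like:

    # postgresql.conf -- illustrative values, tune for your own hardware
    shared_buffers = 1GB              # Postgres' own page cache
    effective_cache_size = 4GB        # planner hint: how much the OS cache will hold
    default_statistics_target = 100   # detail level of planner statistics
    checkpoint_segments = 16          # fewer, larger checkpoints (pre-9.5 setting)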

Second, if you want to avoid synchronizing disk buffers on every commit (as codeka explained in his comment), disable the synchronous_commit configuration option. If your machine loses power, you may lose the most recent transactions, but your database will still be 100% consistent. In this mode, RAM is used to buffer all writes, including writes to the transaction log. So with very rare checkpoints and large shared_buffers and wal_buffers, it can actually approach speeds close to those of a RAM drive.
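A minimal sketch of that configuration (placeholder values again; the trade-off is that a power loss can cost you the last few moments of committed transactions):

    # postgresql.conf -- trade a little durability for commit speed
    synchronous_commit = off    # commits return before the WAL reaches disk
    wal_buffers = 16MB          # larger in-memory buffer for the transaction log
    checkpoint_timeout = 30min  # checkpoint rarely so writes stay in RAM longer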

Also hardware can make a huge difference. 15000 RPM disks can, in practice, be 3x as fast as cheap drives for database workloads. RAID controllers with battery-backed cache also make a significant difference.

If that's still not enough, then it may make sense to consider turning to volatile storage.

intgr
@intgr Thanks for the great info.
orokusaki