views:

725

answers:

3

Does anyone have any suggestions on a persistent, distributed and replicated key-value blob store?

I've seen this list of key-value stores, but they all seem to be designed for storing small values. The blobs I am dealing with range from 2 KB to 300 MB in size.

What I need is something more like MogileFS, but MogileFS is optimized for Write Once/Read Many, whereas I am looking for Write Many/Read Many. I would like the blobs replicated over multiple servers and preferably multiple sites: say, from the primary data center in Texas to a backup data center in Oregon.

Note 1: I have no need to index the data inside the blobs or to create different views in any way. Note 2: The execs really like to own the hardware.

Any suggestions?

A: 

Check out Amazon S3; it sounds like it matches your description perfectly.

Have you looked at Apache CouchDB?

Max
Sorry, I should have added that the execs really like to own the hardware - otherwise, I'd say you are right...
consultutah
A: 

If you want a local key/value store, you want Berkeley DB (BDB). It is an on-disk database that supports non-relational data, is accessible from many languages, supports transactions and has excellent performance characteristics. It is used by applications ranging from the web's top-tier sites down to embedded mobile-phone apps. If you want a distributed, replicated version, look into BDB-HA (high availability). The HA version has improved significantly over time and handles many of the major failure conditions in distributed environments.

If you don't like the sound of BDB or BDB-HA, consider that InnoDB (with the obligatory but in this case useless MySQL front end) also likely meets your conditions: it provides a basic storage engine with replication, disk persistence and good performance characteristics if the working set is in cache. You can remove some of the bottlenecks imposed by a basic MySQL/InnoDB install by reducing the transaction isolation level, flushing to disk less often, and allowing non-synchronous writes. By doing so, you bring its raw performance in line with BDB or any other key/value store while retaining the ability to use ad-hoc SQL queries to administer and monitor your data set.
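
As a rough illustration of that tuning, a my.cnf fragment might look something like the one below. The particular values are my assumptions, not tested recommendations; each one trades durability or consistency for speed, so check them against the MySQL documentation for your version before relying on them.

```ini
[mysqld]
# Lower the isolation level from the InnoDB default (REPEATABLE-READ).
transaction-isolation = READ-COMMITTED

# Write the log at each commit but flush it to disk only about once per second
# (the default, 1, flushes on every commit).
innodb_flush_log_at_trx_commit = 2

# Let the operating system decide when to sync the binary log to disk.
sync_binlog = 0
```

With settings like these, a crash can lose roughly the last second of committed transactions, which may or may not be acceptable for a blob store.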

Brad B
+1  A: 

You might also try MongoDB:

  • Document-oriented storage
  • Efficient storage of binary data large objects (e.g. photos and videos)
  • Replication and fail-over support
  • Auto-sharding for cloud-level scalability

http://www.mongodb.org/display/DOCS/GridFS
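
To give an idea of why GridFS copes with large blobs, here is a minimal in-memory sketch of the general approach it takes: each blob is split into fixed-size chunks, with a small metadata record per blob. This is not the MongoDB driver API; the class name, method names and chunk size are all hypothetical and only there for illustration.

```python
import hashlib
import uuid

# Assumed chunk size for this sketch; the real GridFS default differs by version.
CHUNK_SIZE = 255 * 1024


class ChunkedBlobStore:
    """In-memory stand-in for a GridFS-style chunked blob store (hypothetical)."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.files = {}   # key -> metadata record
        self.chunks = {}  # (file_id, chunk_number) -> bytes

    def put(self, key, data):
        """Store a blob under key, replacing any earlier version (write many)."""
        self.delete(key)
        file_id = uuid.uuid4().hex
        count = 0
        for offset in range(0, len(data), self.chunk_size):
            self.chunks[(file_id, count)] = data[offset:offset + self.chunk_size]
            count += 1
        self.files[key] = {
            "file_id": file_id,
            "length": len(data),
            "chunk_count": count,
            "sha256": hashlib.sha256(data).hexdigest(),
        }

    def get(self, key):
        """Reassemble a blob from its chunks; raises KeyError if key is unknown."""
        meta = self.files[key]
        parts = [self.chunks[(meta["file_id"], n)] for n in range(meta["chunk_count"])]
        return b"".join(parts)

    def delete(self, key):
        """Remove a blob and all of its chunks; unknown keys are ignored."""
        meta = self.files.pop(key, None)
        if meta is not None:
            for n in range(meta["chunk_count"]):
                del self.chunks[(meta["file_id"], n)]


store = ChunkedBlobStore(chunk_size=4)
store.put("greeting", b"hello world")
print(store.files["greeting"]["chunk_count"])
print(store.get("greeting"))
```

Because no single stored record is larger than one chunk, a 300 MB blob never has to fit in a single document, and a reader can fetch chunks one at a time instead of loading the whole value at once.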

Silas