views:

441

answers:

6

Are there any production-quality NoSQL stores that I can use on a production system? I have looked at Cassandra, tokyodb, CouchDB etc., but none of them seem to be ready for deployments on production like environments. I am talking about thousands of requests per minute and lots of reads/writes/updates. My only concern is speed and service times. Does anybody know of production systems that use NoSQL stores effectively? Does anybody know of a NoSQL store that is backed by a big enterprise like Google/Yahoo/IBM?

A: 

Umm, memcached?

Artem Russakovskii
It's not caching I am looking for. It's a datastore where I can store records instead of storing them in a database.
Ritesh M Nayak
+1  A: 

BerkeleyDB is backed by Oracle

Using the native C interface one can reach close to 1 million read requests per second.

By the way, when you say thousands of requests per minute, any 'normal' DB should be able to handle that easily too.

Toad
It is an embedded DB though, I guess he was looking for a network-accessible solution.
Thilo
You can always use it in combination with something like memcachedb (so the memcache protocol, but backed by BerkeleyDB).
Toad
Yes, I am looking for a network-accessible solution. Something that everybody on a team can connect to and work with. Embedded DBs are fine as long as it's on a dev system.
Ritesh M Nayak
ritesh: if you use memcachedb, then it is BerkeleyDB but accessible via the memcache protocol (so usable over a network).
Toad
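To make the "memcache protocol" point above concrete, here is a minimal sketch of what a client would put on the wire for a store and a fetch, using the memcache text protocol. The helper names (`build_set`, `build_get`, `parse_get_response`) are hypothetical and written only for illustration; nothing here is taken from memcachedb itself, and in practice an existing memcache client library would do this for you.

```python
# Sketch of the memcache text protocol, which memcachedb is described as speaking.
# All function names are hypothetical helpers for illustration.

def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a 'set' command: 'set <key> <flags> <exptime> <bytes>\\r\\n<data>\\r\\n'."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a 'get' command for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(data: bytes) -> dict:
    """Parse a complete 'get' reply into {key: value}.

    The reply is zero or more 'VALUE <key> <flags> <bytes>\\r\\n<data>\\r\\n'
    items followed by 'END\\r\\n'.
    """
    values = {}
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)  # raises ValueError if the reply is truncated
        line = data[pos:line_end]
        if line == b"END":
            return values
        parts = line.split()
        if len(parts) < 4 or parts[0] != b"VALUE":
            raise ValueError(f"unexpected line in reply: {line!r}")
        key = parts[1].decode("ascii")
        length = int(parts[3])
        start = line_end + 2
        values[key] = data[start:start + length]
        pos = start + length + 2  # skip the payload and its trailing \r\n
```

The encoded bytes would be written to a TCP connection to the server; the host and port depend on your deployment, so they are left out of the sketch.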
+4  A: 

I think the NoSQL systems are an excellent choice if you 'only' care about speed and service time (and care less, or not at all, about things like consistency and transactions). Facebook uses Cassandra.

"Cassandra is used in Facebook as an email search system containing 25TB and over 100m mailboxes." http://highscalability.com/product-facebooks-cassandra-massive-distributed-store

I think CouchDB isn't really speedy; maybe you can use MongoDB instead: http://www.mongodb.org/display/DOCS/Production+Deployments

Theo
“Facebook uses Cassandra.” Not to overstate the point, but that seems to soundly demolish the “none of them seem to be ready for deployments on production like environments” supposition.
Paul D. Waite
+4  A: 

Cassandra handles thousands of requests (including write-mostly workloads) per second, per machine, and its scaling-by-adding-machines has been there since day 1.

Here is a thread about Cassandra use in production and in-production-soon at dozens of companies: http://n2.nabble.com/Cassandra-users-survey-td4040068.html#a4040068

We're also adding more docs all the time, like http://wiki.apache.org/cassandra/Operations.

jbellis
The references are fantastic. Looks like there's a huge community backing Cassandra. I also love its distributed scaling feature. Cassandra it is !!
Ritesh M Nayak
+2  A: 

Also worth considering is using a traditional RDBMS like MySQL to store schema-less data. This method gives you the stability of a proven database server like MySQL with the flexibility of a NoSQL solution.

Check out this blog posting on how FriendFeed does this.

jamesaharvey
I totally agree. Especially given that you can manipulate the data (complicated stuff) and choose the fields retrieved using SQL. You don't have to rely on a simple get(key) to retrieve records.
Ritesh M Nayak
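As a rough illustration of the "schema-less data in an RDBMS" idea from this answer, the sketch below stores each record as a JSON blob in one generic table and keeps a small side table as an index for one queryable field. It is an assumption-laden example, not the method from the linked blog post: SQLite (from the Python standard library) stands in for MySQL so the code runs anywhere, and the table names (`entities`, `index_user_id`) and helper functions are hypothetical.

```python
import json
import sqlite3
import uuid


def open_store(path=":memory:"):
    """Create the generic entity table plus one index table (hypothetical names)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entities ("
        "id TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS index_user_id ("
        "user_id TEXT NOT NULL, entity_id TEXT NOT NULL, "
        "PRIMARY KEY (user_id, entity_id))"
    )
    return conn


def put(conn, entity, entity_id=None):
    """Store a dict of arbitrary shape; refresh its index row in the same transaction."""
    entity_id = entity_id or uuid.uuid4().hex
    with conn:  # commits on success, rolls back on error
        # INSERT OR REPLACE is SQLite syntax; MySQL would use REPLACE INTO
        # or INSERT ... ON DUPLICATE KEY UPDATE.
        conn.execute(
            "INSERT OR REPLACE INTO entities (id, body) VALUES (?, ?)",
            (entity_id, json.dumps(entity)),
        )
        conn.execute("DELETE FROM index_user_id WHERE entity_id = ?", (entity_id,))
        if "user_id" in entity:
            conn.execute(
                "INSERT INTO index_user_id (user_id, entity_id) VALUES (?, ?)",
                (str(entity["user_id"]), entity_id),
            )
    return entity_id


def get(conn, entity_id):
    """Fetch one record by id, or None if it does not exist."""
    row = conn.execute(
        "SELECT body FROM entities WHERE id = ?", (entity_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def find_by_user(conn, user_id):
    """Use the index table and a plain SQL join to find records by user_id."""
    rows = conn.execute(
        "SELECT e.body FROM index_user_id i "
        "JOIN entities e ON e.id = i.entity_id "
        "WHERE i.user_id = ? ORDER BY e.id",
        (str(user_id),),
    ).fetchall()
    return [json.loads(r[0]) for r in rows]
```

Records with different fields can share the same table, and adding a new queryable field means adding another index table rather than altering the main one.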
A: 

Redis is worth a try; GitHub uses Redis to manage a heavy queue of background jobs.

ardsrk