views:

28

answers:

1

I am developing a service that needs to perform bulk insert of tens of thousands of rows/items/objects at a time, about 20 times per second.

(The NoSQL storage can be sharded, so inserts can work in parallel. The sharding strategy, and sharding in general, do not matter for this discussion.)

The question is: which NoSQL products in your opinion exhibit the best performance under such circumstances? The answer should include all costs, including serialization and overhead of chatty/laconic protocols.

There is no requirement for the storage to be persistent.

Thank you!

A: 

I'm pretty sure you want MongoDB (use 1.5.x version if you want sharding - disregard warnings that it's not production-ready).

taw
Hi Taw, thanks for answering. Can you please explain why?
Teleo
My benchmarks say mongodb is a lot faster than anything else and shards well, ymmv ;-) The underlying reason is that it's written with efficiency in mind - on real modern hardware. There are no unnecessary abstraction layers like SQL/JDBC/whatever. There is no ACID - just per-document atomicity which is good enough for many cases. And most importantly - it doesn't try to do operating system's job like all other dbs - it `mmap`s everything and lets hardware MMU and OS do memory management. [This alone tends to give you 3x+ speedup for free](http://varnish-cache.org/wiki/ArchitectNotes).
taw