For a few projects I'm working on I need a persistent key-value store (something akin to memcache). It would ideally run as a server, and it needs to be really efficient. I'm aware that memcachedb exists, but I'd like to have a go at writing it myself, as there's going to be a lot of custom functionality that I'll need to include later on. I'll probably be writing this in C++ (or possibly C or Java if there is a good reason to do so).

Should I be looking at database implementation (B-trees, indexes, etc.) or is that unnecessary for this kind of job? What would be a good way of storing most of the content on disk, but being able to access it quickly, utilising memory for caching?

Thanks.

+6  A: 

I'd really, really encourage you to reconsider and use a third-party implementation.

If you want to take on a lot of problems that are not part of your domain, then yes, looking into database implementation techniques such as B+Trees is the right next step.
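
To give a feel for where even a toy version starts, here is a minimal C++ sketch. It is not a B+Tree: it swaps in a much simpler technique, an append-only log file with an in-memory hash index of value offsets. The `TinyStore` class and its `put`/`get` methods are made-up names for illustration, and the sketch assumes a single thread and a file that is truncated on startup.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <unordered_map>

// Hypothetical illustration only: values live in an append-only file, and an
// in-memory index maps each key to the offset and size of its latest value.
class TinyStore {
public:
    explicit TinyStore(const std::string& path)
        : path_(path),
          out_(path, std::ios::binary | std::ios::trunc),  // assumption: start empty
          end_(0) {}

    // Append the value to the log and point the index at the new copy.
    // An overwritten value stays in the file as dead space.
    void put(const std::string& key, const std::string& value) {
        out_.write(value.data(), static_cast<std::streamsize>(value.size()));
        out_.flush();  // hands the bytes to the OS; this is not an fsync
        Entry e;
        e.offset = end_;
        e.size = value.size();
        index_[key] = e;
        end_ += value.size();
    }

    // Look the key up in memory, then read the value back from disk.
    bool get(const std::string& key, std::string& value) const {
        std::unordered_map<std::string, Entry>::const_iterator it = index_.find(key);
        if (it == index_.end()) return false;
        std::ifstream in(path_.c_str(), std::ios::binary);
        in.seekg(static_cast<std::streamoff>(it->second.offset));
        value.assign(it->second.size, '\0');
        in.read(&value[0], static_cast<std::streamsize>(it->second.size));
        return static_cast<bool>(in);
    }

private:
    struct Entry {
        std::uint64_t offset;
        std::size_t size;
    };

    std::string path_;
    std::ofstream out_;
    std::uint64_t end_;  // current length of the log in bytes
    std::unordered_map<std::string, Entry> index_;
};
```

Everything this leaves out is the hard part: rebuilding the index after a restart, crash safety, compacting dead space, caching hot values in memory, concurrency, and indexes that do not fit in RAM. That is where structures like B+Trees come in.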

Tom
Thanks - given the responses I'm rethinking writing my own. One of the projects I'm working on will need to store a large matrix of numerical data, where the number of columns and rows could both expand. Do you know of any existing systems that would be appropriate for this kind of job?
HarryM
Of course, it also depends what license you need. If you're developing a commercial library then your choice in this area is quite limited.
Richard Corden
+3  A: 

Do not reinvent the bicycle.

You're going to be a user of the storage. That is, your primary concerns are the business logic and maybe the UI, not the details of how the DB functions. Leave those to the DB implementors. Focus on your primary task.

For instance, try to use HBase (an analogue of Google Bigtable).

Vladimir Dyuzhev
+2  A: 

There are lots of key-value stores, from the tried and true BDB to the hip Tokyo Cabinet. If you must implement your own, I'd recommend checking the Varnish sources, especially the Architecture page.

Javier
Thanks for those links - they're really helpful. Do you have any experience using BDB? I've heard mixed views about the performance of DBM, is Tokyo Cabinet faster?
HarryM
BerkeleyDB is as fast as it gets. But these days even a pure-Java SQL database (read: HSQL) provides very good performance, while still having all or most of the RDBMS/SQL features.
Vladimir Dyuzhev