ansaurus

Question

What is the easiest way to simulate a database table with an index in a key value store?

Answer 1

+2 A:

Use a hashtable or dictionary. If you want unique key values you could use a GUID or hashcode.

Tony 2010-02-16 13:06:26

Do you mean a hastable for the primary key? And where does a dictionary fit in?

Zubair 2010-02-16 13:10:11

In fact, I think a hashtable will probably suit you better. Dictionary is just another possible option.

Tony 2010-02-16 13:46:30

Answer 2

A:

The key-value store should support ordering the keys and ranged access to the keys.

Then you should create two dictionaries:

id -> payload

and

col1, id -> NULL

, where payload should contain all the data the database table would contain, and the keys of the second dictionary should contain the values of (col1, id) from each entry of the first dictionary.

Quassnoi 2010-02-16 13:09:51

The key value store I am using does not support ordering of keys or ranged access.

Zubair 2010-02-16 13:11:56

Also, could you rephrase the answer, I didn't understand the part about dictionaries. Also what is col1?

Zubair 2010-02-16 13:12:46

Answer 3

+1 A:

If you use Redis (an advanced key-value store that supports strings, lists, sets, etc.) Then this is quite easy. I have already developed a C# redis client that has native support for storing POCO's data models. These exact same POCO's can be used by OrmLite to store it in a RDBMS.

By the way Redis is fast, I have a benchmark that stores and retrieves the entire Northwind Database (3202 records) in under 1.2 seconds (running inside a UnitTest on a 3yo iMac).

I store entities in two ways

Distinct entities, where I combine the Class type name and Primary Key to create a unique key e.g. urn:user:1
- I then maintain a separate set of primary keys (in a Redis Set) to keep track of all my entities, using a key like: ids:user
In a Redis server side list - which acts very much like a table with support for paging, using a key like: lists:user

mythz 2010-02-16 13:41:07

I used Redis in the past, and yes, it is blindingly fast, I loved it, especially all the range, set and atomic (incr and decr) operations it has. Its just that now I am dealing with a system which stores several billion records in a dynamically expandable database (Riak), so Redis isn't quite right for this, as everything has to fit into memory

Zubair 2010-02-16 13:48:29

Actually redis trunk includes a VM which will allow you to store 10 million keys (with any sized value) using only 1.6GB memory, while still remaining blazingly fast - see: http://antirez.com/post/redis-virtual-memory-story.html

mythz 2010-02-16 13:52:46

Yeah, I have seen that, very cool!! In fact, even Riak has a Redis based backend. It still doesn't scale to billions of records though, but very nice.

Zubair 2010-02-16 14:01:25

Isn't billions of records a hardware problem? We've got entry-level dell servers with 64GB RAM. 64GB ~= 400M keys. So 3 servers = 1.2 billion records. Use a client with consistent hashing and you're there.

mythz 2010-02-16 14:07:51

Yes that is one possibility. But also once you start running Riak range queries and ordered lists over the cluster you have to perform distributed merges and stuff like that, and the performance goes down 1000 fold (something I have tried). I would recommend trying it though. In another project we did such what you described, but we had nine servers (3 copies of the data). We ended up handcoding all the index and range queries so they worked over the cluster in the end as Redis doesn't support these yet. Looking back we should have used MongoDB for that project

Zubair 2010-02-16 14:14:27

I guess that is a question in itself. IF there is a way I can do fast range queries in redis and get fast ordered lists then I could use it!

Zubair 2010-02-16 14:34:44

http://stackoverflow.com/questions/2273726/what-is-the-most-efficient-way-to-perform-a-range-query-over-a-cluster-of-nodes

Zubair 2010-02-16 14:54:14

It sounds like you may be able to use Redis 'ordered sets' and 'ZRANGEBYSCORE' in order to get a dataset range by score value.

mythz 2010-02-16 15:19:34

How will the score help with the merge of the resultsets (or are we talking about different things now)??

Zubair 2010-02-16 16:12:13

We might be. If you can compile your query for each value into a 'score' that you store in an 'Ordered Set', you can then make a single call to each Redis instance to get all values that match that score (i.e. your query). A union of these results should contain all values that match your score.

mythz 2010-02-16 16:43:42

I'll admit, I did not know about this "score" thing. Is it an algorithm I can read about on the web somewhere, or is ot something you have invented?

Zubair 2010-02-16 21:06:23

Nah it's just using a double to order values in a sorted set. For a good explanation see: http://groups.google.com/group/redis-db/browse_thread/thread/a3d9b4743017e90b/07585626443bd275

mythz 2010-02-17 11:03:45

I read the thread but still didn't understand it. is there a simple explanation of it anywhere on the web, what should I even be googling, is it "Double Order search"?

Zubair 2010-02-17 12:06:34

ansaurus

tags:

views:

answers:

What is the easiest way to simulate a database table with an index in a key value store?

related questions