views:

173

answers:

3

I have lists of variable length where each item can be one of four unique, that I need to use as keys for another object in a map. Assume that each value can be either 0, 1, 2 or 3 (it's not integer in my real code, but a lot easier to explain this way) so a few examples of key lists could be:

[1, 0, 2, 3]
[3, 2, 1]
[1, 0, 0, 1, 1, 3]
[2, 3, 1, 1, 2]
[1, 2]

So, to re-iterate: each item in the list can be either 0, 1, 2 or 3 and there can be any number of items in a list.

My first approach was to try to hash the contents of the array, using the built in GetHashCode() in .NET to combine the hash of each element. But since this would return an int I would have to deal with collisions manually (two equal int values are identical to a Dictionary).

So my second approach was to use a quad tree, breaking down each item in the list into a Node that has four pointers (one for each possible value) to the next four possible values (with the root node representing [], an empty list), inserting [1, 0, 2] => Foo, [1, 3] => Bar and [1, 0] => Baz into this tree would look like this:

Quad Tree Diagram

Grey nodes nodes being unused pointers/nodes. Though I worry about the performance of this setup, but there will be no need to deal with hash collisions and the tree won't become to deep (there will mostly be lists with 2-6 items stored, rarely over 6).

Is there some other magic way to store items with lists of values as keys that I have missed?

+1  A: 

If there are rarely more than six items in the list, and each item has only two bits of information, then I think the structure you want for your "key lists" is called "int". :)

Just use e.g. the first 4 bits to say how 'long' the key list is (0-14) and the last 28 (or fewer) bits to store the actual key. Then use a Dictionary<int,Blah> where the int is the key list representation.

Brian
+1  A: 

[Edit - Changed answer to reflect comments by @gradbot and @Brian]

You're saying you rarely will have more than 6 elements. If you can limit the maximum to 14 elements you could use GetHashCode(). Since you only need 2 bits to store the value, 32 bits in an int will give you the posibility to create a unique hash code up to 14 elements and take the 0 value into account as well.

int[] arr = new [] { 1, 2, 3, 0, 1, 2, 3 };
public override int GetHashCode()
{
    if(arr.Length > 14) throw new Exception("max elems is 14");
    int hash = 1; // start with 1 to take into account a heading 0
    foreach (int i in arr)
    {
        hash = hash << 2;
        hash += i;
    }
    return hash;
}

If you want to make the hash reversible you would have to use some bits for the length as well. And the code could be tweaked to allow 15 elements as well as mentioned by @gradbot.

Mikael Svenson
This does not distinguish the list [0;1;2;3] from the list [1;2;3].
Brian
You can fix this by setting the initial value of the hash to 1 however you will limit yourself to 14 elements. This is the only way I can think of to handle starting and trailing zeros. You could tweak it to handle 15 elements.
gradbot
@Brian, @gradbot, I forgot about the heading 0 and have modified my code to start the hash with a 1 to allow for 14 items. Thanks for pointing it out.
Mikael Svenson
+6  A: 

Note that in F#, the Map data structure can happily use list or array elements as keys; it uses structural comparison (rather than a hashcode) to store things in a persistent tree.

let myData = [
    [0;1;3], "foo"
    [1;2], "bar"
    [3;1;2;0;3], "qux"
    ]

let mutable m = Map.empty 
for k,v in myData do
    m <- Map.add k v m

printfn "%s" (Map.find [1;2] m)
Brian