I'd like to implement a data structure similar to the lookup functionality of Amazon S3. For those of you who don't know what I'm talking about: Amazon S3 stores all files at the root, but lets you look up groups of files by common prefixes in their names, thereby replicating the power of a directory tree without its complexity.

The catch is that both lookup and filter operations are O(1) (or close enough that, even on very large buckets - S3's disk equivalents - both operations might as well be O(1)).

So in short, I'm looking for a data structure that functions like a hash map but also supports efficient (at the very least not O(n)) prefix filtering. The best I can come up with is extending HashMap so that it also keeps a sorted list of its keys, doing a binary search for the range that matches the prefix, and returning that set. This seems slow to me, but I can't think of any other way to do it.
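
Roughly, what I have in mind is something like this sketch (class and method names are just placeholders, not anything from an existing library). It keeps the O(1) hash lookup, but pays O(n) per insert to keep the key list sorted:

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PrefixHashMap<V> {

    private final Map<String, V> map = new HashMap<String, V>();
    private final List<String> sortedKeys = new ArrayList<String>();

    public void put(String key, V value) {
        if (map.put(key, value) == null) {
            // new key: insert it at the position reported by binarySearch
            int idx = Collections.binarySearch(sortedKeys, key);
            sortedKeys.add(-idx - 1, key);
        }
    }

    public V get(String key) {
        return map.get(key); // plain O(1) hash lookup
    }

    // all keys starting with the given prefix: one binary search
    // plus a linear scan over the matching range
    public List<String> keysWithPrefix(String prefix) {
        int idx = Collections.binarySearch(sortedKeys, prefix);
        int from = idx >= 0 ? idx : -idx - 1;
        List<String> result = new ArrayList<String>();
        for (int i = from; i < sortedKeys.size()
                && sortedKeys.get(i).startsWith(prefix); i++) {
            result.add(sortedKeys.get(i));
        }
        return result;
    }
}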

Does anyone know either how Amazon does it, or a better way to implement this data structure?

+2  A: 

Just to validate my claim that a regular TreeMap should suffice for any bucket with as many as 1,000,000 entries, here's a very simple test case that gives some numbers (note: this isn't meant as a rigorous microbenchmark; it's just to get a feel for the order of magnitude of the problem).

I've used randomly generated UUIDs to mimic keys (if you replaced the dashes with slashes, you'd even get something like a directory structure). I then put them into a regular java.util.TreeMap and queried them with map.subMap(fromKey, toKey).

import java.util.ArrayList;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

public class TreeMapRangeQueryTest {

    public static void main(String[] args) {

        TreeMap<String, Object> map = new TreeMap<String, Object>();

        int count = 1000000;
        ArrayList<String> uuids;

        {
            System.out.print("generating ... ");
            long start = System.currentTimeMillis();
            uuids = new ArrayList<String>(count);
            for (int i = 0; i < count; i++) {
                uuids.add(UUID.randomUUID().toString());
            }
            System.out.println((System.currentTimeMillis() - start) + "ms");
        }

        {
            System.out.print("inserting .... ");
            long start = System.currentTimeMillis();

            Object o = new Object();
            for (int i = 0; i < count; i++) {
                map.put(uuids.get(i), o);
            }

            System.out.println((System.currentTimeMillis() - start) + "ms");
        }

        {
            System.out.print("querying ..... ");

            // range covering keys with the "be4" prefix (upper bound is exclusive)
            String from = "be400000-0000-0000-0000-000000000000";
            String to =   "be4fffff-ffff-ffff-ffff-ffffffffffff";

            long start = System.currentTimeMillis();

            long matches = 0;

            for (int i = 0; i < count; i++) {
                // subMap returns a view; size() walks the matching entries
                Map<String, Object> result = map.subMap(from, to);
                matches += result.size();
            }

            System.out.println((System.currentTimeMillis() - start) + "ms (" + matches / count
                    + " matches)");
        }
    }
}

and here is some sample output from my machine (1,000,000 keys, 1,000,000 range queries):

generating ... 6562ms
inserting .... 2933ms
querying ..... 5344ms (229 matches)

Inserting 1 key took an average of 0.003 ms (certainly more towards the end though) while querying a sub range with 229 matches took 0.005 ms per query. That's some pretty sane performance, isn't it?

After increasing the number to 10,000,000 keys and queries, the numbers are as follows:

generating ...  59562ms
inserting ....  47099ms
querying ..... 444119ms (2430 matches)

Inserting 1 key took an average of 0.005 ms, while querying a sub range with 2430 matches took 0.044 ms per query. Even though querying got roughly 10 times slower (in the end, counting the matches means iterating over every entry in the sub-map view, which is always O(n) in the number of matches), the performance still isn't bad.

As S3 is a cloud service, I'd assume it's pretty much network-bound anyway, so there's no urgent need for an extremely fancy data structure to get the required performance. Still, some features are missing from my test case, most notably concurrency and persistence. Nevertheless, I think I've shown that a regular tree structure is sufficient for this use case. If you want to do something fancy, experiment with subtree read-write locking and maybe a replacement for .subMap(fromKey, toKey).
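
As an aside (not part of the test above), the hard-coded from/to keys can be generalized: for an arbitrary prefix, the matching range is [prefix, successor of prefix). A small sketch, assuming the prefix's last character isn't Character.MAX_VALUE:

import java.util.NavigableMap;
import java.util.SortedMap;

public class PrefixRange {

    // returns the sub-map view of all entries whose keys start with the given
    // prefix, by computing the smallest string that sorts after every such key
    static <V> SortedMap<String, V> prefixSubMap(NavigableMap<String, V> map, String prefix) {
        if (prefix.isEmpty()) {
            return map;
        }
        // "successor" of the prefix: bump its last character by one
        // (assumes that character isn't Character.MAX_VALUE)
        String upper = prefix.substring(0, prefix.length() - 1)
                + (char) (prefix.charAt(prefix.length() - 1) + 1);
        return map.subMap(prefix, upper);
    }
}

For example, prefixSubMap(map, "be4") covers every key that starts with "be4", much like the hard-coded range used in the test above.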

sfussenegger
Thanks a lot. I guess I just discarded red-black trees without really thinking about them, but you're right, they're more than fast enough for almost any usage.
dimo414
+1  A: 

Just to append to sfussenegger's answer: concurrency is very easy using a ConcurrentSkipListMap, and it has properties similar to a TreeMap. It's not too "fancy" a data structure (and in any case, it's already implemented for you). It's certainly easier than sub-tree read-write locking.
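
A minimal sketch of the same kind of range query on a ConcurrentSkipListMap (the keys here are made up, just to show that the subMap call carries over unchanged from the TreeMap test above):

import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListPrefixDemo {

    public static void main(String[] args) {
        // thread-safe drop-in for the TreeMap used in the test above
        ConcurrentSkipListMap<String, Object> map =
                new ConcurrentSkipListMap<String, Object>();

        map.put("be3fa000-0000-0000-0000-000000000001", new Object());
        map.put("be400000-1111-2222-3333-444444444444", new Object());
        map.put("be4abcde-0000-0000-0000-000000000002", new Object());

        // live, weakly consistent view of all keys in the "be4..." range;
        // note that size() has to traverse the view, so it isn't O(1)
        ConcurrentNavigableMap<String, Object> hits =
                map.subMap("be400000-0000-0000-0000-000000000000",
                           "be4fffff-ffff-ffff-ffff-ffffffffffff");

        System.out.println(hits.size() + " matches"); // prints "2 matches"
    }
}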

Rudiger
+1 I'd even say that a skip list is an astonishingly simple data structure. Thanks for reminding me of it.
sfussenegger