ansaurus

Question

What is a data structure for quickly finding non-empty intersections of a list of sets?

Answer 1

+1 A:

One representation that might help is storing the sets I as vectors V of size n whose entries V(i) are 0 when i is not in V and positive otherwise. Then to take the intersection of two vectors you multiply the terms, and to take the union you add the terms.

Tom Smith 2010-04-06 23:07:39

Answer 2

+2 A:

I think your current solution is optimal big-O wise, though there are micro-optimization techniques that could improve its actual performance. Such as using bitwise operations when merging the chosen set in item_containing set with the valid items set.

i.e. you store items_containing as this:

items_containing = [0x0000, 0x0001, 0x0011, 0x0010, 0x0100, 0x0100]

and your valid_items can use bit-wise OR to merge like this:

int valid_items(Set I, Set candidate) {
    // if you need more than 32-items, use int[] for valid 
    // and int[][] for items_containing
    int valid = 0x0000;
    for (int item : candidate) {
        // bit-wise OR
        valid |= items_containing[item];
    }
    return valid;
}

but they don't really change the Big-O performance.

Lie Ryan 2010-04-06 23:11:03

ansaurus

tags:

views:

answers:

What is a data structure for quickly finding non-empty intersections of a list of sets?

related questions