I have a variable number of ArrayLists that I need to find the intersection of. A realistic cap on the number of lists of strings is probably around 35, but it could be more. I don't want any code, just ideas on what could be efficient. I have an implementation that I'm about to start coding, but I want to hear some other ideas.

Currently, just thinking about my solution, it looks like I should have an asymptotic run-time of Θ(n²).

Thanks for any help!

tshred

Edit: To clarify, I really just want to know whether there is a faster way to do it, faster than Θ(n²).

A: 

Sort them (O(n lg n)) and then do binary searches (O(lg n) per lookup).
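For anyone who does want to see it, here is a minimal sketch of the sort-then-search idea. The class and method names are made up for illustration, and it assumes the elements are Strings compared by natural ordering. It sorts copies of every list except the first, then binary-searches each candidate from the first list in the others.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, not part of any library.
class SortedIntersection {
    static List<String> intersect(List<List<String>> lists) {
        if (lists.isEmpty()) {
            return new ArrayList<>();
        }
        // Sort a copy of every list except the first so the inputs stay untouched.
        List<List<String>> sorted = new ArrayList<>();
        for (List<String> list : lists.subList(1, lists.size())) {
            List<String> copy = new ArrayList<>(list);
            Collections.sort(copy);
            sorted.add(copy);
        }
        // LinkedHashSet drops duplicates from the first list and keeps its order.
        Set<String> result = new LinkedHashSet<>();
        for (String candidate : lists.get(0)) {
            boolean inAll = true;
            for (List<String> other : sorted) {
                // binarySearch returns a negative value when the key is absent.
                if (Collections.binarySearch(other, candidate) < 0) {
                    inAll = false;
                    break;
                }
            }
            if (inAll) {
                result.add(candidate);
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("b", "a", "c"),
                Arrays.asList("c", "b", "d"));
        System.out.println(intersect(lists));
    }
}
```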

glowcoder
+6  A: 

Set.retainAll() is how you find the intersection of two sets. If you use HashSet, then converting your ArrayLists to Sets and using retainAll() in a loop over all of them is actually O(n) in the total number of elements, assuming constant-time hash lookups.
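In case it helps, a rough sketch of that loop might look like the following. The class and method names are hypothetical, and it assumes the elements are Strings with the usual equals()/hashCode() behaviour.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, not part of any library.
class RetainAllIntersection {
    static Set<String> intersect(List<List<String>> lists) {
        if (lists.isEmpty()) {
            return new HashSet<>();
        }
        // Start from a copy of the first list, then shrink it list by list.
        Set<String> result = new HashSet<>(lists.get(0));
        for (int i = 1; i < lists.size() && !result.isEmpty(); i++) {
            // Wrapping the argument in a HashSet keeps each contains() check cheap.
            result.retainAll(new HashSet<>(lists.get(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("b", "c", "d"),
                Arrays.asList("c", "b"));
        System.out.println(intersect(lists));
    }
}
```

The loop stops early once the running result is empty, since nothing can be added back by further retainAll() calls.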

Michael Borgwardt
Beat me to it :)
Chris Dennett
+1  A: 

The best option would be to use HashSet to store the contents of these lists instead of ArrayList. If you can do that, you can create a temporary HashSet to which you add the elements to be intersected (use the addAll(..) method). Do tempSet.retainAll(storedSet) and tempSet will contain the intersection.

Chris Dennett
+1  A: 

One more idea - if your arrays/sets are different sizes, it makes sense to begin with the smallest.
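A sketch of how that ordering could be applied, under the assumption that the lists hold Strings (the names here are invented for the example): sort the lists by size, seed the result from the smallest one, and filter it against each remaining list so the working set never grows beyond the smallest list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, not part of any library.
class SmallestFirstIntersection {
    static Set<String> intersect(List<List<String>> lists) {
        if (lists.isEmpty()) {
            return new HashSet<>();
        }
        // Order a copy of the outer list by size; the inputs are left untouched.
        List<List<String>> bySize = new ArrayList<>(lists);
        bySize.sort(Comparator.comparingInt((List<String> l) -> l.size()));

        // Seed with the smallest list: the intersection can never be larger than it.
        Set<String> result = new HashSet<>(bySize.get(0));
        for (int i = 1; i < bySize.size() && !result.isEmpty(); i++) {
            // Keep only the elements of the current list that survived so far.
            Set<String> next = new HashSet<>();
            for (String s : bySize.get(i)) {
                if (result.contains(s)) {
                    next.add(s);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("a", "b", "c", "d"),
                Arrays.asList("b", "c"),
                Arrays.asList("c", "b", "e"));
        System.out.println(intersect(lists));
    }
}
```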

a1ex07
A: 

You can use a single HashSet. Its add() method returns false when the object is already in the set. Adding the objects from the lists and counting the false return values gives you the union in the set plus the data for a histogram (the objects whose count + 1 equals the number of lists are your intersection). If you put the counts into a TreeSet, you can detect an empty intersection early.
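Below is a loose sketch of the counting idea, not the exact scheme described above: it swaps the HashSet plus separate counters for a single HashMap from element to the number of lists containing it, and it leaves out the TreeSet early-exit. It also deduplicates each list first, on the assumption that a list may contain the same string twice, which would otherwise inflate the counts. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, not part of any library.
class CountingIntersection {
    static Set<String> intersect(List<List<String>> lists) {
        // Maps each element to the number of lists it appears in.
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> list : lists) {
            // Deduplicate so each list contributes at most one count per element.
            for (String s : new HashSet<>(list)) {
                counts.merge(s, 1, Integer::sum);
            }
        }
        // An element is in the intersection when every list contained it.
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() == lists.size()) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("a", "b", "b"),
                Arrays.asList("b", "c"),
                Arrays.asList("b", "a"));
        System.out.println(intersect(lists));
    }
}
```

The key set of the map is the union of all lists, so this variant trades extra memory for a single pass over the input.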

binary_runner