I'll start with the assumption that the datasets can fit in memory. If not, then you will need something fancier.
I refer below to a "set", where I am thinking of something like a C++ std::set. I don't know the Java equivalent, but any storage scheme that permits rapid lookup (tree, hash table, whatever).
Comparing three lists: L0, L1 and L2.
- Read L0, placing each element in a set: S0.
- Read L1, placing items that match an element of S0 into a new set: S1, and discarding others.
- Discard S0.
- Read L2, keeping items that match an element of S1 and discarding others.
Update
Just realised that the question was for "n" lists, not three. However the extension should be obvious. (I hope)
Update 2
Some untested C++ code to illustrate the algorithm
#include <string>
#include <vector>
#include <set>
#include <cassert>
typedef std::vector<std::string> strlist_t;
strlist_t GetMatches(std::vector<strlist_t> vLists)
{
assert(vLists.size() > 1);
std::set<std::string> s0, s1;
std::set<std::string> *pOld = &s1;
std::set<std::string> *pNew = &s0;
// unconditionally load first list as "new"
s0.insert(vLists[0].begin(), vLists[0].end());
for (size_t i=1; i<vLists.size(); ++i)
{
//swap recently read "new" to "old" now for comparison with new list
std::swap(pOld, pNew);
pNew->clear();
// only keep new elements if they are matched in old list
for (size_t j=0; j<vLists[i].size(); ++j)
{
if (pOld->end() != pOld->find(vLists[i][j]))
{
// found match
pNew->insert(vLists[i][j]);
}
}
}
return strlist_t(pNew->begin(), pNew->end());
}