views:

61

answers:

2

I have two arrays, N and M. they are both arbitrarily sized, though N is usually smaller than M. I want to find out what elements in N also exist in M, in the fastest way possible.

To give you an example of one possible instance of the program, N is an array 12 units in size, and M is an array 1,000 units in size. I want to find which elements in N also exist in M. (There may not be any matches.) The more parallel the solution, the better.

I used to use a hash map for this, but it's not quite as efficient as I'd like it to be.

Typing this out, I just thought of running a binary search of M on sizeof(N) independent threads. (Using CUDA) I'll see how this works, though other suggestions are welcome.

+1  A: 

1000 is a very small number. Also, keep in mind that parallelizing a search will only give you speedup as the number of cores you have increases. If you have more threads than cores, your application will start to slow down again due to context switching and aggregating information.

A simple solution for your problem is to use a hash join. Build a hash table from M, then look up the elements of N in it (or vice versa; since both your arrays are small it doesn't matter much).

Edit: in response to your comment, my answer doesn't change too much. You can still speed up linearly only until your number of threads equals your number of processors, and not past that.

If you want to implement a parallel hash join, this would not be difficult. Start by building X-1 hash tables, where X is the number of threads/processors you have. Use a second hash function which returns a value modulo X-1 to determine which hash table each element should be in.

When performing the search, your main thread can apply the auxiliary hash function to each element to determine which thread to hand it off to for searching.

danben
That was just an example. The size of M varies from run to run, and can be arbitrarily large (millions of elements). The size of M doesn't change within a run, but the program uses N and M as work units of sorts, so that once they've both been compared for matches, they're filled with new data, and compared again, over and over.
GenTiradentes
+1  A: 

Just sort N. Then for each element of M, do a binary search for it over sorted N. Finding the M items in N is trivially parallel even if you do a linear search over an unsorted N of size 12.

phkahler