I am experimenting with parallelizing algorithms in Java. I began with merge sort, and posted my attempt in this question. My revised attempt is in the code below, where I now try to parallelize quick sort.
Are there any rookie mistakes in my multi-threaded implementation or approach to this problem? If not, shouldn't I expect more than a 32% speed increase between a sequential and a parallelized algorithm on a duel-core (see timings at bottom)?
Here is the multithreading algorithm:
public class ThreadedQuick extends Thread
{
final int MAX_THREADS = Runtime.getRuntime().availableProcessors();
CountDownLatch doneSignal;
static int num_threads = 1;
int[] my_array;
int start, end;
public ThreadedQuick(CountDownLatch doneSignal, int[] array, int start, int end) {
this.my_array = array;
this.start = start;
this.end = end;
this.doneSignal = doneSignal;
}
public static void reset() {
num_threads = 1;
}
public void run() {
quicksort(my_array, start, end);
doneSignal.countDown();
num_threads--;
}
public void quicksort(int[] array, int start, int end) {
int len = end-start+1;
if (len <= 1)
return;
int pivot_index = medianOfThree(array, start, end);
int pivotValue = array[pivot_index];
swap(array, pivot_index, end);
int storeIndex = start;
for (int i = start; i < end; i++) {
if (array[i] <= pivotValue) {
swap(array, i, storeIndex);
storeIndex++;
}
}
swap(array, storeIndex, end);
if (num_threads < MAX_THREADS) {
num_threads++;
CountDownLatch completionSignal = new CountDownLatch(1);
new ThreadedQuick(completionSignal, array, start, storeIndex - 1).start();
quicksort(array, storeIndex + 1, end);
try {
completionSignal.await(1000, TimeUnit.SECONDS);
} catch(Exception ex) {
ex.printStackTrace();
}
} else {
quicksort(array, start, storeIndex - 1);
quicksort(array, storeIndex + 1, end);
}
}
}
Here is how I start it off:
ThreadedQuick.reset();
CountDownLatch completionSignal = new CountDownLatch(1);
new ThreadedQuick(completionSignal, array, 0, array.length-1).start();
try {
completionSignal.await(1000, TimeUnit.SECONDS);
} catch(Exception ex){
ex.printStackTrace();
}
I tested this against Arrays.sort and a similar sequential quick sort algorithm. Here are the timing results on an intel duel-core dell laptop, in seconds:
Elements: 500,000, sequential: 0.068592, threaded: 0.046871, Arrays.sort: 0.079677
Elements: 1,000,000, sequential: 0.14416, threaded: 0.095492, Arrays.sort: 0.167155
Elements: 2,000,000, sequential: 0.301666, threaded: 0.205719, Arrays.sort: 0.350982
Elements: 4,000,000, sequential: 0.623291, threaded: 0.424119, Arrays.sort: 0.712698
Elements: 8,000,000, sequential: 1.279374, threaded: 0.859363, Arrays.sort: 1.487671
Each number above is the average time of 100 tests, throwing out the 3 lowest and 3 highest cases. I used Random.nextInt(Integer.MAX_VALUE) to generate an array for each test, which was initialized once every 10 tests with the same seed. Each test consisted of timing the given algorithm with System.nanoTime. I rounded to six decimal places after averaging. And obviously, I did check to see if each sort worked.
As you can see, there is about a 32% increase in speed between the sequential and threaded cases in every set of tests. As I asked above, shouldn't I expect more than that?