views:

1747

answers:

4

Do you know of any library that would provide a well-tested concurrent quicksort or mergesort algorithm for Java?

We've had issues on a 16-(virtual)-cores Mac where only one core (!) was working using the default Java sorting algo and it was, well, not good to see that very fine machine be completely underused. So we wrote our own (I wrote it) and we did indeed gain good speedups (I wrote a multithreaded quicksort and due to its partitioning nature it parallelize very well but I could have written a mergesort too)... But my implementation only scales up to 4 threads, it's proprietary code, and I'd rather use one coming from a reputable source instead of using my re-invented wheel.

The only one I found on the Web is an example of how not to write a multi-threaded quicksort in Java, it is busy-looping (which is really terrible) using a:

while (helpRequested) { }

http://broadcast.oreilly.com/2009/06/may-column-multithreaded-algor.html

So in addition to losing one thread for no reason it's making sure to kill the perfs by busy-looping in that while loop (which is mindboggling).

Hence my question: do you know of any correctly multithreaded quicksort or mergesort implementation in Java that would be coming from a reputable source?

I put the emphasis on the fact that I know that the complexity stays O(n log n) but I'd still enjoy very much to see all these cores start working instead of idling. Note that for other tasks, on that same 16 virtual cores Mac, I saw speedup of up to x7 by parallelizing the code (and I'm by no mean an expert in concurrency).

So even tough the complexity stays O(n log n), I'd really appreciate a x7 or x8 or even x16 speedup.

+7  A: 

give a try to fork/join framework by Doug Lea:

public class MergeSort extends RecursiveAction {
    final int[] numbers;
    final int startPos, endPos;
    final int[] result;

    private void merge(MergeSort left, MergeSort right) {
        int i=0, leftPos=0, rightPos=0, leftSize = left.size(), rightSize = right.size();
        while (leftPos < leftSize && rightPos < rightSize)
            result[i++] = (left.result[leftPos] <= right.result[rightPos])
                ? left.result[leftPos++]
                : right.result[rightPos++];
        while (leftPos < leftSize)
            result[i++] = left.result[leftPos++];
        while (rightPos < rightSize)
        result[i++] = right.result[rightPos++];
    }

    public int size() {
        return endPos-startPos;
    }

    protected void compute() {
        if (size() < SEQUENTIAL_THRESHOLD) {
            System.arraycopy(numbers, startPos, result, 0, size());
            Arrays.sort(result, 0, size());
        } else {
            int midpoint = size() / 2;
            MergeSort left = new MergeSort(numbers, startPos, startPos+midpoint);
            MergeSort right = new MergeSort(numbers, startPos+midpoint, endPos);
            coInvoke(left, right);
            merge(left, right);
        }
    }
}

(source: http://www.ibm.com/developerworks/java/library/j-jtp03048.html?S_TACT=105AGX01&amp;S_CMP=LP)

dfa
@dfa: +1, a wonderful paper that I didn't know about and a great article, excellent!
Webinator
A: 

You probably did consider this, but it might help to look at the concrete problem from a higher level, for example if you don't sort just one array or list it might be much easier to sort individual collections concurrently using the traditional algorithm instead of trying to concurrently sort a single collection.

Fabian Steeg
+1  A: 

I don't know a library but here is my take derived from http://www.cs.princeton.edu/introcs/42sort/QuickSort.java.html by Bob Sedgewick and Kevin Wayne

import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicLong;

//  derived from 
//  http://www.cs.princeton.edu/introcs/42sort/QuickSort.java.html
//  Copyright © 2007, Robert Sedgewick and Kevin Wayne.
public class QuickSort {

  private AtomicLong comparisons = new AtomicLong(0);
  private AtomicLong exchanges = new AtomicLong(0);
  private AtomicLong threadCount = new AtomicLong(0);
  private AtomicLong partitionerCount = new AtomicLong(0);

  private ExecutorService executor = Executors.newCachedThreadPool( new ThreadFactory() {
    public Thread newThread( Runnable r ) {
      threadCount.incrementAndGet();
      return new Thread(r);
    }
  });

  public synchronized void quicksort(double[] a) throws InterruptedException, ExecutionException {
    //shuffle(a);                        // to guard against worst-case
    quicksort(a, 0, a.length - 1);
  }

  // quicksort a[left] to a[right]
  private  void quicksort(double[] a, int left, int right) throws InterruptedException, ExecutionException {
    if (right <= left)
      return;
    int i = executor.submit( new Partitioner(a, left, right)).get();
    quicksort(a, left, i - 1);
    quicksort(a, i + 1, right);
  }

  private class Partitioner implements Callable<Integer> {
    private double[] a;
    private int left;
    private int right;

    // partition a[left] to a[right], assumes left < right
    public Partitioner(double[] a, int left, int right) {
      this.a = a;
      this.left = left;
      this.right = right;
      partitionerCount.incrementAndGet();
    }
    public Integer call() {
      int i = left - 1;
      int j = right;
      while (true) {
        while (less(a[++i], a[right]))
          // find item on left to swap
          ; // a[right] acts as sentinel
        while (less(a[right], a[--j]))
          // find item on right to swap
          if (j == left)
            break; // don't go out-of-bounds
        if (i >= j)
          break; // check if pointers cross
        exch(a, i, j); // swap two elements into place
      }
      exch(a, i, right); // swap with partition element
      return i;
    }

 // is x < y ?
    private boolean less(double x, double y) {
      comparisons.incrementAndGet();
      return (x < y);
    }

 // exchange a[i] and a[j]
    private void exch(double[] a, int i, int j) {
      exchanges.incrementAndGet();
      double swap = a[i];
      a[i] = a[j];
      a[j] = swap;
    }
  }

  public void finalize() {
    try {
      executor.shutdown();
    } catch ( Throwable e ) {}
  }

  // test client
  public static void main(String[] args) throws InterruptedException, ExecutionException {
    int N = args.length == 0 ? 1000000 : Integer.parseInt(args[0]);

    // generate N random real numbers between 0 and 1
    long start = System.currentTimeMillis();
    double[] a = new double[N];

    for (int i = 0; i < N; i++)
      a[i] = Math.random();
    long stop = System.currentTimeMillis();
    double elapsed = (stop - start) / 1000.0;
    System.out.println("Generating " + N + " input:  " + elapsed + " seconds");

    // sort them
    start = System.currentTimeMillis();
    QuickSort q = new QuickSort();
    q.quicksort(a);
    stop = System.currentTimeMillis();
    elapsed = (stop - start) / 1000.0;
    System.out.println("Quicksort:   " + elapsed + " seconds");

    // print statistics
    System.out.println("Comparisons:  " + q.comparisons.get());
    System.out.println("Exchanges:    " + q.exchanges.get());
    System.out.println("Threads:      " + q.threadCount.get());
    System.out.println("Partitioners: " + q.partitionerCount.get());
    double[] b = new double[a.length];
    System.arraycopy(a,0,b,0,a.length);
    Arrays.sort(b);
    System.out.println("isSorted:     " + Arrays.equals(a,b));

  }
}
Clint
This is actually awful compared to the Java Arrays.sort()
Clint
What is afwul about your own answer compared to the Java Arrays.sort()? If you look at the Java source for Arrays.sort() you'll not see "elegant" code. Is it awful because it's not hidden behind a method call? To me if it brings a huge speedup, like the paper by Doug Lea seems to imply, it's certainly nothing awful. That said I don't know why you commented out the shuffle: I experienced ugly worst case behavior the times I tried to comment it :)
Webinator
For 1000000 it takes several seconds compared to tenths of seconds for the Arrays.sort(). I'm not commenting on the original code just my modifications to spread it across multiple threads.
Clint
I personally would not guarantee that this implementation would really work on multiprocessor-machines because there's no mechanism involved ensuring that the data in the array is correctly visible by all threads (Memory barrier), one processor-cache involved and the algorithm will not work.
Tobias P.
A: 

Why do you think a parallel sort would help? I'd think most sorting is i/o bound, not processing. Unless your compare does a lot of calculations, a speedup is unlikely.

Stephan Eggermont