tags:

views:

210

answers:

4

Hi all,

Just wondering if anyone would be able to take a look at this code for implementing the quicksort algorithm and answer me a few questions, please :-)

public class Run
{
  /***************************************************************************
   * Quicksort code from Sedgewick 7.1, 7.2.
   **************************************************************************/
  public static void quicksort(double[] a)
  {
    //shuffle(a); // to guard against worst-case
    quicksort(a, 0, a.length - 1, 0);
  }

  static void quicksort(final double[] a, final int left, final int right, final int tdepth)
  {
    if (right <= left)
      return;
    final int i = partition(a, left, right);

    if ((tdepth < 4) && ((i - left) > 1000))
    {
      final Thread t = new Thread()
      {
        public void run()
        {
          quicksort(a, left, i - 1, tdepth + 1);
        }
      };
      t.start();
      quicksort(a, i + 1, right, tdepth + 1);

      try
      {
        t.join();
      }
      catch (InterruptedException e)
      {
        throw new RuntimeException("Cancelled", e);
      }
    } else
    {
      quicksort(a, left, i - 1, tdepth);
      quicksort(a, i + 1, right, tdepth);
    }
  }

  // partition a[left] to a[right], assumes left < right
  private static int partition(double[] a, int left, int right)
  {
    int i = left - 1;
    int j = right;
    while (true)
    {
      while (less(a[++i], a[right]))
        // find item on left to swap
        ; // a[right] acts as sentinel
      while (less(a[right], a[--j]))
        // find item on right to swap
        if (j == left)
          break; // don't go out-of-bounds
      if (i >= j)
        break; // check if pointers cross
      exch(a, i, j); // swap two elements into place
    }
    exch(a, i, right); // swap with partition element
    return i;
  }

  // is x < y ?
  private static boolean less(double x, double y)
  {
    return (x < y);
  }

  // exchange a[i] and a[j]
  private static void exch(double[] a, int i, int j)
  {
    double swap = a[i];
    a[i] = a[j];
    a[j] = swap;
  }

  // shuffle the array a[]
  private static void shuffle(double[] a)
  {
    int N = a.length;
    for (int i = 0; i < N; i++)
    {
      int r = i + (int) (Math.random() * (N - i)); // between i and N-1
      exch(a, i, r);
    }
  }

  // test client
  public static void main(String[] args)
  {
    int N = 5000000; // Integer.parseInt(args[0]);

    // generate N random real numbers between 0 and 1
    long start = System.currentTimeMillis();
    double[] a = new double[N];
    for (int i = 0; i < N; i++)
      a[i] = Math.random();
    long stop = System.currentTimeMillis();
    double elapsed = (stop - start) / 1000.0;
    System.out.println("Generating input:  " + elapsed + " seconds");

    // sort them
    start = System.currentTimeMillis();
    quicksort(a);
    stop = System.currentTimeMillis();
    elapsed = (stop - start) / 1000.0;
    System.out.println("Quicksort:   " + elapsed + " seconds");

  }
}

My questions are:

  1. What is the purpose of the variable tdepth?

  2. Is this considered a "proper" implementation of a parallel quicksort? I ask becuase it doesn't use implements Runnable or extends Thread...

  3. If it doesn't already, is it possible to modify this code to use multiple threads? By passing in the number of threads you want to use as a parameter, for example...?

Many thanks,

Brian

+5  A: 

1. It's used to keep track of recursion depth. This is checked to decide whether to run in parallel. Notice how when the function runs in parallel it passes tdepth + 1 (which becomes tdepth in the called quicksort's parameters). This is a basic way of avoiding too many parallel threads.

2. Yes, it's definitely using another thread. The code:

new Thread()
{
  public void run()
  {
    quicksort(a, left, i - 1, tdepth + 1);
  }
};

creates an anonymous inner class (which extends Thread), which is then started.

Matthew Flaschen
@Dolph, actually the anonymous inner class does extend Thread.
Matthew Flaschen
Thanks for the responses guys. I see what ``tdepth`` is for now (and that answers Q3 too)As for Anonymous inner classes - I've never come across them but I'll have a read up on them on Sun's website, as linked to.Thanks again everyone. What a great site :-)
Brian
@Matthew, yeah I worded that incredibly poorly... The anonymous inner class *anonymously extends* `java.lang.Thread` and simply overrides `Thread.run()`, so nothing has to explicitly `extend Thread`. Furthermore, `java.lang.Thread` implements `Runnable` itself.
Dolph
+2  A: 
  1. Apparently, tdepth is used to avoid creating too many threads

  2. It uses an anonymous class, which implicitly extends Thread

  3. It does that already (see point 1.)

Maurice Perry
A: 
  1. tdepth is there so that there's an upper bound on the number of threads created. Note that ever time the method calls itself recursively (which is done in a new thread), tdepth is incremented by one. This way, only the first four levels of recursion will create new threads, presumably to prevent overloading the OS with many threads for little benefit.

  2. This code launches its own threads in the definition of the quicksort method, so it will use parallel processing. One might argue that it could do with some kind of thread management and that e.g. some kind of Executor might be better, but it is definitely parallel. See the call to new Thread() ... followed by start(). Incidentally, the call to t.join() will cause the current thread to wait for the thread t to finish, in case you weren't aware of that.

  3. This code already uses multiple threads, but you can tweak how many it spawns given the comparison on tdepth; increasing or decreasing the value will determine how many levels of recursion create threads. You could complete rewrite the code to use executors and threadpools, or perhaps to perform trinary recursion instead of binary - but I suspect that in the sense you asked; no, there's no simple way to tweak the number of threads.

Andrzej Doyle
I don't think "the current thread handles the last 32000 on its own" is right. When the current thread calls quicksort recursively, it will then split off another thread to handle the 3rd quarter, etc.
Matthew Flaschen
@Andrzej Doyle: nonsense. The first thread picks a pivot and performs *n* comparisons. After that there are already two threads working, etc. The first pass doing n comparison is *not* what takes the longest in QuickSort. You may mistakingly be thinking that after the first pass half of the data is sorted. But it is really not: all you know is upper half is bigger than the pivot and lower half is smaller.
Webinator
"Nonsense" is right, duly removed! I overlooked the nature of the recursive call.
Andrzej Doyle
A: 

I did actually wrote a (correctly) multi-threaded QuickSort in Java so maybe I can help a bit...

Question here for anyone interested:

http://stackoverflow.com/questions/2210185/correctly-multithreaded-quicksort-or-mergesort-algo-in-java

What is the purpose of the variable tdepth?

as other have commented, it serves to determine whether to create new threads or not.

Is this considered a "proper" implementation of a parallel quicksort? I ask because it doesn't use implements Runnable or extends Thread...

I don't think it's that proper for several reasons: first you should make it CPU dependent. There's no point in spawning 16 threads on a CPU that has just one core: a mono-threaded QuickSort shall outperfom the multi-threaded one on a single core machine. On a 16-cores machines, sure, fire up to 16 threads.

Runtime.getRuntime().availableProcessors()

Then the second reason I really don't like it is that it is using last-century low-level Java idiosyncrasish threading details: I prefer to stay away from .join() and use higher level things (see fork/join in the other question or something like CountDownLatch'es, etc.). The problem with things low-level like Java's thread "join" is that it carries no useful meaning: this is 100% Java specific and can be replaced by higher-level threading facilities whose concept are portable across languages.

Then don't comment the shuffle at the beginning. Ever. I've seen dataset where QuickSort degrades quadratically if you remove that shuffle. And it's just an O(n) shuffle, that won't slow down your sort :)

If it doesn't already, is it possible to modify this code to use multiple threads? By passing in the number of threads you want to use as a parameter, for example...?

I'd try to write and/or reuse an implementation using higher-level concurrency facilities. See the advices in the question I asked here some time ago.

Webinator