ansaurus

Question

How do you write an idiomatic Scala Quicksort function?

Answer 1

+11 A:

A few years back, I spent some time trying to optimize functional quicksort as far as I could. The following is what I came up with for vanilla List[A]:

def qsort[A : Ordering](ls: List[A]) = {
  import Ordered._

  def sort(ls: List[A])(parent: List[A]): List[A] = {
    if (ls.size <= 1) ls ::: parent else {
      val pivot = ls.head

      val (less, equal, greater) = ls.foldLeft((List[A](), List[A](), List[A]())) {
        case ((less, equal, greater), e) => {
          if (e < pivot)
            (e :: less, equal, greater)
          else if (e == pivot)
            (less, e :: equal, greater)
          else
            (less, equal, e :: greater)
        }
      }

      sort(less)(equal ::: sort(greater)(parent))
    }
  }
  sort(ls)(Nil)
}

I was able to do even better with a custom List structure. This custom structure basically tracked the ideal (or nearly ideal) pivot point for the list. Thus, I could obtain a far better pivot point in constant time, simply by accessing this special list property. In practice, this did quite a bit better than the standard functional approach of choosing the head.

As it is, the above is still pretty snappy. It's "half" tail recursive (you can't do a tail-recursive quicksort without getting really ugly). More importantly, it rebuilds from the tail end first, so that results in substantially fewer intermediate lists than the conventional approach.

It's important to note that this is not the most elegant or most idiomatic way to do quicksort in Scala, it just happens to work very well. You will probably have more success writing merge sort, which is usually a much faster algorithm when implemented in functional languages (not to mention much cleaner).

Daniel Spiewak 2010-06-03 01:30:14

Many thanks for the answer, The conclusion that quicksort can be coded tail recursively I think is quite important, if not, I suspect that there could possibly be a category of algorithms which would be off limits in Scala. I'm going to wait a little longer to see if anyone else responds.

Don Mackenzie 2010-06-03 05:36:09

Out of interest, why do you use an `Ordering` context bound rather than an `Ordered[A]` view bound? And related to this, why is the `import ord._` necessary?

Ben Lings 2010-06-03 09:50:43

@Ben `Ordering` is substantially more flexible than `Ordered` because it doesn't impose a restriction on the class hierarchy of the object in question. Anything which extends `Ordered` has a corresponding implicit instance of `Ordering`, but not vice versa. The `import ord._` brings the comparison operators into scope for type `A` so that I don't have to do something like `implicitly[Ordering[A]].gt(e, pivot)`.

Daniel Spiewak 2010-06-03 14:10:26

Given the fact that Scala has both functional and object-oriented constructs, what "category of algorithms" would be off-limits? The only ones I can think of are those that rely on a lazy evaluation scheme, which Scala does not provide.

Randall Schulz 2010-06-03 14:14:31

Actually, Scala *does* provide a lazy evaluation scheme! I've used this quite often to implement functional data structures following Okasaki's techniques.

Daniel Spiewak 2010-06-03 17:55:58

Could this be a case where the imperative side to Scala could provide the most practical and performant solution? This question is really about implementing any recursive algorithm where the call graph is a tree rather than a sequence, idiomatically in Scala (I hope that makes sense).

Don Mackenzie 2010-06-03 19:41:56

Most sorting functions are very naturally recursive. Trying to implement them using loops (i.e. imperatively or tail-recursively) is usually an exercise in frustration. Even imperative quicksort implementations are usually recursive.

Daniel Spiewak 2010-06-04 02:00:12

The call graph remark makes a lot of sense. Idiomatically, if your problem is logically polyrecursive (i.e. has more than one recursive or mutually-recursive invocation per call frame), then you should encode it in the most natural way possible. Don't bend over backwards to achieve tail-recursion. It's really good to have, but not worth contorting your code to get.Note that Scala 2.8 includes a `trampoline` library which can transform naturally-encoded polyrecursive functions into tail-recursive evaluations.

Daniel Spiewak 2010-06-04 02:02:45

Giving sort algorithms any consideration is not something that I've done much since university so I'm grateful someone clearly has. It seems quite a deep and interesting topic.

Don Mackenzie 2010-06-04 07:51:11

@Daniel just discovered that instead of the `import ord`, etc, you can do `import Ordered._` to allow the operators to be used.

Ben Lings 2010-06-06 16:32:30

Ah, very nice! I'll update my answer to reflect this approach, since it's really much cleaner.

Daniel Spiewak 2010-06-06 17:54:24

@Daniel Spiewak: If you're referring to Scala's by-name parameters, they're not the same thing as lazy parameters and it's a mistake to conflate the two evaluation schemes.

Randall Schulz 2010-06-06 18:43:41

@Randall: Scala doesn't allow lazy parameters. However, by combining call-by-name with lazy values, it is possible to fully simulate Haskell's call-by-need evaluation strategy. Scala's scheme is actually *more* powerful than Haskell's because it is possible to force a re-evaluation of a by-name parameter, while a by-need parameter will always be memoized.

Daniel Spiewak 2010-06-06 20:07:41

@Daniel: Yes, I'm aware of that, but it is clumsy beyond reason. Perhaps a metaprogramming capability could make it manageable, but as it stands, it's onerous unless you need just a smattering.

Randall Schulz 2010-06-07 01:12:46

Answer 2

+1 A:

I guess it depends on what do you mean by "idiomatic". The main advantage of quicksort is being a very fast in-place sorting algorithm. So, if you can't sort in-place, you loose all its advantages -- but you're still stuck with it's dis advantages.

So, here's some code I wrote for Rosetta Code on this very subject. It still doesn't sort in-place, but, on the other hand, it sorts any of the new collections:

import scala.collection.TraversableLike
import scala.collection.generic.CanBuildFrom
def quicksort
  [T, CC[X] <: Traversable[X] with TraversableLike[X, CC[X]]]      // My type parameters -- which are expected to be inferred
  (coll: CC[T])                                                    // My explicit parameter -- the one users will actually see
  (implicit ord: Ordering[T], cbf: CanBuildFrom[CC[T], T, CC[T]])  // My implicit parameters -- which will hopefully be implicitly available
  : CC[T] =                                                        // My return type -- which is the very same type of the collection received
  if (coll.isEmpty) {
    coll
  } else {
    val (smaller, bigger) = coll.tail partition (ord.lt(_, coll.head))
    quicksort(smaller) ++ coll.companion(coll.head) ++ quicksort(bigger)
  }

Daniel 2010-06-07 03:41:06

It will take me a while to get my head round some of the type parameter defs but I suspect this is an idiomatic approach because the algorithm is very visible in the implementation and your approach is generic too. Perhaps I got provoked into thinking there was an approach which could avoid potential stack overflows without using, for example, trampoline techniques or carrying a large amount of state.

Don Mackenzie 2010-06-07 09:11:44

ansaurus

tags:

views:

answers:

How do you write an idiomatic Scala Quicksort function?

related questions