views:

210

answers:

2

I find this Quicksort partitioning approach confusing and wrong, yet it seems to work. I am referring to this pseudocode. Note: they also have a C implementation at the end of the article, but it's very different from their pseudocode, so I don't care about that.

I have also written it in C like this, trying to stay true to the pseudocode as much as possible, even if that means doing some weird C stuff:

#include <stdio.h>

int partition(int a[], int p, int r)
{
    int x = a[p];

    int i = p - 1;
    int j = r + 1;

    while (1)
    {
        do
            j = j - 1;
        while (!(a[j] <= x));

        do
             i = i + 1;
        while (!(a[i] >= x));

        if (i < j)
        {
            int t = a[i];
            a[i] = a[j];
            a[j] = t;
        }
        else
        {
            for (i = 1; i <= a[0]; ++i)
                printf("%d ", a[i]);
            printf("- %d\n", j);

            return j;
        }
    }
}


int main()
{
    int a[100] = //{8, 6,10,13,15,8,3,2,12};
                {7, 7, 6, 2, 3, 8, 4, 1};

    partition(a, 1, a[0]);
    return 0;
}

If you run this, you'll get the following output:

1 6 2 3 4 8 7 - 5

However, this is wrong, isn't it? Clearly a[5] does not have all the values before it lower than it, since a[2] = 6 > a[5] = 4. Not to mention that 7 is supposed to be the pivot (the initial a[p]) and yet its position is both incorrect and lost.

The following partition algorithm is taken from wikipedia:

int partition2(int a[], int p, int r)
{
    int x = a[r];
    int store = p;

    for (int i = p; i < r; ++i)
    {
        if (a[i] <= x)
        {
            int t = a[i];
            a[i] = a[store];
            a[store] = t;

            ++store;
        }
    }

    int t = a[r];
    a[r] = a[store];
    a[store] = t;

    for (int i = 1; i <= a[0]; ++i)
        printf("%d ", a[i]);
    printf("- %d\n", store);

    return store;
}

And produces this output:

1 6 2 3 8 4 7 - 1

Which is a correct result in my opinion: the pivot (a[r] = a[7]) has reached its final position.

However, if I use the initial partitioning function in the following algorithm:

void Quicksort(int a[], int p, int r)
{
    if (p < r)
    {
        int q = partition(a, p, r); // initial partitioning function

        Quicksort(a, p, q);
        Quicksort(a, q + 1, r); // I'm pretty sure q + r was a typo, it doesn't work with q + r.
    }
}

... it seems to be a correct sorting algorithm. I tested it out on a lot of random inputs, including all 0-1 arrays of length 20.

I have also tried using this partition function for a selection algorithm, in which it failed to produce correct results. It seems to work and it's even very fast as part of the quicksort algorithm however.

So my questions are:

  1. Can anyone post an example on which the algorithm DOESN'T work?
  2. If not, why does it work, since the partitioning part seems to be wrong? Is this another partitioning approach that I don't know about?
+2  A: 

I think the partitioning is correct. 7 is the pivot. The original array is partitioned into a sub array of length 5 containing elements less than or equal to 7 and a sub array of length 2, containing elements greater or equal to 7.

Henrik
You mean it's correct because `a[1..5]` has elements <= `7` and `a[6..7]` has elements >= `7`? That's true, but shouldn't the selected pivot reach its final position? According to my textbooks and wikipedia, it should.
IVlad
Yes. In the first link you gave, quicksort then proceeds to sort a[1..5] and a[6..7].Only in the wikipedia link a[6] is required to have its final value of 7, as here a[1..5] and a[7..7] are sorted in the recursive calls.
Henrik
Interesting, I didn't think about it like that. I always thought the partitioning algorithm is supposed to return the final position of the pivot element. Thanks for your explanation. I'll accept your answer a bit later to give others a chance to comment on this algorithm too.
IVlad
Yes, it works as Henrik has said, but it's a very wasteful way to do a quicksort to have the pivot sorted again. It has a different sort invariant than the wikipedia one ([ <= p | >= p] vs. [ <= p | > p]). The reason you don't generally see quicksorts done this way is that it isn't as efficient. Resorting the pivot will generate more stack calls as well as extra unnecessary comparisons (and probably swaps too).
Justin Peel
@Justin Peel - actually, the first implementation seems to be much faster than the wiki one, even with random pivots. But that discussion is for another question :).
IVlad
@IVlad, Yes, but the wiki one isn't very fast either. You want to use Sedgewick's converging pointers method (as used in the pseudocode, or even better in the code below the pseudocode) to greatly reduce the number of swaps. There are plenty of other optimizations to be done as well.
Justin Peel
A: 

From Wikipedia (I've emphasized the part that I think addresses your question directly):

This is the in-place partition algorithm. It partitions the portion of the array between indexes left and right, inclusively, by moving all elements less than or equal to array[pivotIndex] to the beginning of the subarray, leaving all the greater elements following them. In the process it also finds the final position for the pivot element, which it returns. It temporarily moves the pivot element to the end of the subarray, so that it doesn't get in the way. Because it only uses exchanges, the final list has the same elements as the original list. Notice that an element may be exchanged multiple times before reaching its final place. It should also be noted that in case of pivot duplicates in the input array, they can be spread across left subarray, possibly in random order. This doesn't represent a partitioning failure, as further sorting will reposition and finally "glue" them together.

Could that be what you were missing?

Dan Tao
No, I don't see what that has to do with my question. That describes the algorithm that I said definitely works, I was asking why the first one also works. Plus, the first one does not find the final position of the pivot element, nor does it temporarily move anything.
IVlad