tags:

views:

89

answers:

4

I have a need for an efficient sort that doesn't have a callback, but is as customizable as using qsort(). What I want is for it to work like an iterator, where it continuously calls into the sort API in a loop until it is done, doing the comparison in the loop rather than off in a callback function. This way the custom comparison is local to the calling function (and therefore has access to local variables, is potentially more efficient, etc). I have implemented this for an inefficient selection sort, but need it to be efficient, so prefer a quick sort derivative.

Has anyone done anything like this? I tried to do it for quick sort, but trying to turn the algorithm inside out hurt my brain too much.

Below is how it might look in use.

// the array of data we are sorting
MyData array[5000], *firstP, *secondP;

// (assume data is filled in)

Sorter sorter;

// initialize sorter
int result = sortInit (&sorter, array, 5000,
        (void **)&firstP, (void **)&secondP, sizeof(MyData));

// loop until complete
while (sortIteration (&sorter, result) == 0) {
    // here's where we do the custom comparison...here we
    // just sort by member "value" but we could do anything
    result = firstP->value - secondP->value;
    }
A: 

What you're asking for has already been done -- it's called std::sort, and it's already in the C++ standard library. Better support for this (among many other things) is part of why well-written C++ is generally faster than C.

Jerry Coffin
Pointing out that C++'s sort function is better isn't terribly useful when the question is tagged 'C'.
Mark Ransom
My understanding of std::sort is that it requires a compare function unless you are sorting one of the standard types that it supports. As Mark points out, I need it for C, but if std::sort actually did it the way I want it, I could borrow an implementation and convert it to work in C. That doesn't appear to be the case though.
rob
std::sort can either use a compare function or a compare object. Being able to use an object gives it a lot more flexibility.
Mark Ransom
The bottom line is that there *is* no good way to handle this in C -- but his C code is probably trivial to convert to C++, where good solutions exist. Yes, there's a comparison function (or funtor) in C++, but since `std::sort` is a template, the comparison will *normally* be generated as inline code. In theory you could do the same with some C macros, but I doubt it's feasible. In short, converting to C++ by (by a wide margin) the most practical solution.
Jerry Coffin
@Jerry: Can you really recommend converting to C++ without knowing more about the program like what platform(s) and how many lines of code? For programs over 100,000 lines, converting to C++ will be a lot more work than trying to adapt `std::sort`'s technique to vanilla C. For all we know, rob (OP) is on an embedded platform and doesn't have a C++ compiler available.
tomlogic
@tomlogic:yes, there are (a few) possibilities that would rule out C++. Most of them are fairly low-likelihood possibilities and he hasn't given even a hint that they apply in this case. That leaves C++ as the first choice, by a pretty wide margin.
Jerry Coffin
@Jerry: there are some compelling reasons to avoid C++ and stick to C (or C + some high level extension language), some are listed here --> http://thread.gmane.org/gmane.comp.version-control.git/57643/focus=57918 (contains strong language)
Matt Curtis
@Matt: I've seen the rant before. The only compelling (or any other kind of) reason you can find there is purely political: if you want to write Linux kernel code, C++ will obviously be rejected out of hand, so that's a reason to use C. From a technical viewpoint, he seems to do little more than display willful ignorance.
Jerry Coffin
C or C++, I don't see how std::sort would help. Maybe someone can provide a link to example code where the comparison code, rather than being "elsewhere", is right there within the function that initiates the sorting operation.Just like I don't want to iterate an array by having a callback function called for every array member -- preferring to be able to access the array member right there in the for loop block -- I would love to have the same for sort. I don't see how std::sort provides this.
rob
@rob: `std::sort` helps the compiler generate the **object** code for the comparison inline. If you want the **source** code for the comparison inline, you could use something like `Boost::lambda`.
Jerry Coffin
@Jerry, I think you must be missing the point. I want to sort things without having to add or change any code outside the function or block where I want to sort them. And I want the comparison operation to have access to local variables. As I showed in my example, this is clearly possible (and I have implemented it for an inefficient sorting algorithm). Are you claiming this is possible in C++ with existing stuff? From what I read, I don't think it is.
rob
@rob: yes, a Lambda has access to local variables, and you'd be able to write your comparison inline as well.
Jerry Coffin
cool, I will look into that
rob
+2  A: 

Turning the sort function inside out as you propose isn't likely to make it faster. You're trading indirection on the comparison function for indirection on the item pointers.

It appears you want your comparison function to have access to state information. The quick-n-dirty way to create global variables or a global structure, assuming you don't have more than one thread going at once. The qsort function won't return until all the data is sorted, so in a single threaded environment this should be safe.

The only other thing I would suggest is to locate a source to qsort and modify it to take an extra parameter, a pointer to your state structure. You can then pass this pointer into your comparison function.

Mark Ransom
Well, I'm not really trying to make it faster (although I had hoped it might due to making it more inlinable), I just don't want to make it a lot slower, as my implementation with selection sort is.I do need to support threaded environments.I could do as you suggest with adding a state pointer to qsort...that would help a lot, but ultimately I was hoping to do "iterator style" simply because I prefer that API by a wide margin. I've been using my implementation (using selection sort) for a few years (when performance isn't a big factor) and it just keeps things so clean and easy to use.
rob
+1  A: 

Take an existing implementation of qsort and update it to reference the Sorter object for its local variables. Instead of calling a compare function passed in, it would update its state and return to the caller.

Because of recursion in qsort, you'll need to keep some sort of a state stack in your Sorter object. You could accomplish that with an array or a linked-list using dynamic allocation (less efficient). Since most qsort implementations use tail recursion for the larger half and make a recursive call to qsort for the smaller half of the pivot point, you can sort at least 2n elements if your array can hold n states.

tomlogic
Yes, I think that is exactly what I was trying to do but "returning to the caller" rather than calling the compare function in effect turns the whole thing inside out, and while I know it can be done, it is extremely difficult....so I was hoping someone had already done it somewhere. :)
rob
A: 

You could write a preprocessor macro to output a sort routine, and have the macro take a comparison expression as an argument.


#define GENERATE_SORT(name, type, comparison_expression) \
  void name(type* begin, type* end) \
  { /* ... when needed, fill a and b and use comparison_expression */ }

GENERATE_SORT(sort_ints, (*a<*b))

void foo()
{
    int array[10];
    sort_ints(array, array+10);
}

Matt Curtis
Here's an example of this approach applied to data structures --> http://attractivechaos.wordpress.com/2008/09/02/generic-programming-in-c/ <-- It's not for the feint hearted and personally I'd hate to be debugging that beast, but if you had already optimised everything else and really wanted to squeeze out the overhead of a function call, you could do it in vanilla C.
Matt Curtis
Eek. I've seen a macro'd up qsort, its quite scary as well. However my intention with this request has more to do with an easier to use API without needing globals, than it does with outright speed. I just find it annoying to have to use callback functions rather than do the work "right there".
rob
Maybe we can petition the C standards committee to add lambas! That will open the door for macros, tail recursion, garbage collection, and lots more parens ;-)
Matt Curtis