views:

416

answers:

2

I have found a way that improves (as far as I have tested) upon the quicksort algorithm beyond what has already been done. I am working on testing it and then I want to get the word out about it. However, I would appreciate some help with some things. So here are my questions. All of my code is in C++ by the way.

  1. One of the sorts I have been comparing to my quicksort is the std::sort from the C++ Standard Library. However, it appears to be extremely slow. I am only sorting arrays of ints and longs, but it appears to be around 8-10 times slower than both my quicksort and a standard quicksort by Bentley and McIlroy (and maybe Sedgewick). Does anyone have any ideas as to why it is so slow? The code I use for the sort is just std::sort(a,a+numelem); where a is the array of longs or ints and numelem is the number of elements in the array. The numbers are very random, and I have tried different sizes as well as different amounts of repeated elements. I also tried qsort, but it is even worse as I expected. Edit: Ignore this first question - it's been resolved.

  2. I would like to find more good quicksort implementations to compare with my quicksort. So far I have a Bentley-McIlroy one and I have also compared with the first published version of Vladimir Yaroslavskiy's dual-pivot quicksort. In addition, I plan on porting timsort (which is a merge sort I believe) and the optimized dual-pivot quicksort from the jdk 7 source. What other good quicksorts implementations do you know about? If they aren't in C or C++ that might be okay because I am pretty good at porting, but I would prefer C or C++ ones if you know of them.

  3. How would you recommend getting out the word about my additions to the quicksort? So far my quicksort seems to be significantly faster than all other quicksorts that I've tested it against. The main source of its speed is that it handles repeated elements much more efficiently than other methods that I've found. It almost completely eradicates worst case behavior without adding much time in checking for repeated elements. I posted about it on the Java forums, but got no response. I also tried writing to Jon Bentley because he was working with Vladimir on his dual-pivot quicksort and got no response (though I wasn't terribly surprised by this). Should I write a paper about it and put it on arxiv.org? Should I post in some forums? Are there some mailing lists to which I should post? I have been working on this for some time now and my method is legit. I do have some experience with publishing research because I am a PhD candidate in computational physics. Should I try approaching someone in the Computer Science department of my university? By the way, I have also developed a different dual-pivot quicksort, but it isn't better than my single-pivot quicksort (though it is better than Vladimir's dual-pivot quicksort with some datasets).

I really appreciate your help. I just want to add what I can to the computing world. I'm not interested in patenting this or any absurd thing like that.

+9  A: 

If you have confidence in your work, definitely try to discuss it with someone knowledgeable at your university as soon as possible. It's not enough to show that your code runs faster than another procedure on your machine. You have to mathematically prove whatever performance gain you claim to have achieved through analysis of your algorithm. I'd say the first thing to do is make sure both algorithms you are comparing are implemented and compiled optimally - you may just be fooling yourself here. The likelihood of an individual achieving such a marked improvement upon such an important sorting method without already having thorough knowledge of its accepted variants just seems minuscule. However, don't let me discourage you. It should be interesting anyway. Would you be willing to post the code here? ...Also, since quicksort is especially vulnerable to worst-case scenarios, the tests you choose to run may have a huge effect, as will the choice of pivots. In general, I would say that any data set with a large number of equivalent elements or one that is already highly sorted is never a good choice for quicksort - and there are already well-known ways of combating that situation, and better alternative sorting methods.

Eric Mickelsen
I have been working through various improvements to quicksorts on and off for a few months. I feel like I am pretty well aware of the main improvements including better pivot choosing (median-of-3 or randomization), using iteration instead of recursion(which I am ignoring for now for simplicity and only comparing recursive functions), sorting the smaller partition first, using insertion sort when the size of an array gets small enough, Sedgewick's converging pointers method, and some others. There are also the methods for handling repeated elements(Dutch National Flag and Bentley-McIlroy)
Justin Peel
I have also tried to find my improvement because I thought that someone else had to thought it up, but I haven't been able to find it anywhere.
Justin Peel
@Justin I find it odd that you have such an array of knowledge on the topic of quicksort, enough to make you believe you have improved the algorithm, yet don't know how to test your improvements properly even to the point of not isolating scalar improvements offered by your development and operating environments.
San Jacinto
This is because while I started learning C/C++ many years ago, I have mostly been working in higher-level languages like Matlab and Python for the quick development of physics codes. However, when the new dual-pivot quicksort came out a few months ago, it piqued my interest in the quicksort and I decided to try to find a better dual-pivot quicksort. However, in the process of doing so, I realized an improvement for the single-pivot quicksort. In other words, I am not a pro when it comes to C/C++, but I feel like I have a developed a good understanding of quicksort algorithms.
Justin Peel
+8  A: 

If you have truly made a breakthrough and have the math to prove it, you should try to get it published in the Journal of the ACM. It's definitely one of the more prestigious journals for computer science.

The second best would be one of the IEEE journals such as Transactions on Software Engineering.

R Samuel Klatchko