views:

823

answers:

3

I'm considering porting a large chunk of processing to the GPU using a GLSL shader. One of the immediate problems I stumbled across is that in one of the steps the algorithm needs to maintain a list of elements, sort them, and take the few largest ones (how many is data-dependent). On the CPU this is simply done with an STL vector and qsort(), but in GLSL I don't have such facilities. Is there a way to deal with this deficiency?
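For reference, the CPU-side step is roughly this (a sketch; std::sort stands in for qsort(), and the float element type and the name top_k are placeholders for illustration):

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Sort descending and keep the k largest values.
// Element type and k are placeholders; the real code sorts with qsort().
std::vector<float> top_k(std::vector<float> values, std::size_t k) {
    std::sort(values.begin(), values.end(), std::greater<float>());
    if (values.size() > k) values.resize(k);
    return values;
}
```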

+5  A: 

Have you seen this article? http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter46.html

I wasn't sure whether you were looking for a Quicksort algorithm or just a quick sorting algorithm. The algorithm in the article uses merge sort...

Björn
Yeah, I think MergeSort makes much more sense to run on a SIMD platform (due to memory locality) than QuickSort.
Mehrdad Afshari
I was rather looking for a complete sort within one pass because the sorting is only one step in my algorithm which should run for every fragment.
shoosh
Very good answer. The algorithms in the article are good. Bitonic sorter FTW :-)
ypnos
+7  A: 

Disclosure: I really don't know GLSL -- I've been doing GPGPU programming with the AMD Stream SDK, which uses a different programming language.

From your comment on Björn's answer, I gather that you are not interested in using the GPU to sort a huge database -- like creating a reverse phone book or whatever -- but that instead you have a small dataset and each fragment has its own dataset to sort. More like trying to do median pixel filtering?

I can only say in general:

For small datasets, the sort algorithm really doesn't matter. While people have spent careers worrying about which is the best sort algorithm for very large databases, for small N it really doesn't matter whether you use quicksort, heap sort, radix sort, shell sort, optimized bubble sort, unoptimized bubble sort, etc. At least it doesn't matter much on a CPU.

GPUs are SIMD devices, so they like to have each kernel executing the same operations in lock step. Calculations are cheap but branches are expensive, and data-dependent branches where each kernel branches a different way are very, very, very expensive.

So if each kernel has its own small dataset to sort, and the number of elements to sort is data-dependent and could differ between kernels, you're probably better off picking a maximum size (if you can), padding the arrays with infinity or some large sentinel value, and having each kernel perform the exact same sort: an unoptimized, branchless bubble sort, something like this:

Pseudocode (since I don't know GLSL), sorting 9 points:

// Branchless compare-and-swap: afterwards a <= b.
#define TwoSort(a, b) { float tmp = min(a, b); b = max(a, b); a = tmp; }
// Bubble sort with fixed trip counts: every pass bubbles the largest
// remaining element of A[0..8] to the end.
for (int n = 8; n > 0; --n) {
  for (int i = 0; i < n; ++i) {
    TwoSort (A[i], A[i+1]);
  }
}
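A testable C++ model of the same idea, combining the padding and the fixed-size branchless sort (the maximum of 9 elements and the name pad_and_sort are assumptions for illustration):

```cpp
#include <algorithm>
#include <array>
#include <limits>
#include <vector>

// Model of the GPU approach: every "kernel" sorts the same fixed-size
// array, with unused slots padded by +infinity so the real values sort
// to the front and every kernel runs identical iterations.
constexpr std::size_t kMaxN = 9;  // assumed per-fragment maximum

std::array<float, kMaxN> pad_and_sort(const std::vector<float>& values) {
    std::array<float, kMaxN> a;
    a.fill(std::numeric_limits<float>::infinity());
    std::copy(values.begin(), values.end(), a.begin());
    // Unoptimized bubble sort with fixed trip counts (no data-dependent
    // branches; the compare-and-swap is just min/max).
    for (std::size_t n = kMaxN - 1; n > 0; --n) {
        for (std::size_t i = 0; i < n; ++i) {
            float lo = std::min(a[i], a[i + 1]);
            float hi = std::max(a[i], a[i + 1]);
            a[i] = lo;
            a[i + 1] = hi;
        }
    }
    return a;
}
```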
Die in Sente
Very nice. This is exactly what I was looking for. Do you have any references for the disadvantages of data-dependent branches?
shoosh
I don't have any references off the top of my head. BTW, another reason quicksort won't work on GPUs is that they don't support recursion.
Die in Sente
+1  A: 

I haven't got any knowledge about GPU programming.

I'd use heapsort rather than quicksort, because you said you only need to look at the top few values. The heap can be constructed in O(n) time, and extracting each top value takes O(log n). Therefore, if the number of values you need is significantly smaller than the total number of elements, you could gain some performance.
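On the CPU side, that idea looks roughly like this (a sketch using the standard library's heap functions; the float element type and the name largest_k are placeholders):

```cpp
#include <algorithm>
#include <vector>

// Build a max-heap in O(n), then pop only the k largest values,
// each pop costing O(log n) -- cheaper than a full sort when k << n.
std::vector<float> largest_k(std::vector<float> values, std::size_t k) {
    std::make_heap(values.begin(), values.end());    // O(n)
    std::vector<float> out;
    for (std::size_t i = 0; i < k && !values.empty(); ++i) {
        std::pop_heap(values.begin(), values.end()); // max moves to back
        out.push_back(values.back());
        values.pop_back();
    }
    return out;
}
```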

Georg