ansaurus

Question

Fastest code C/C++ to select the median in a set of 27 floating point values

Answer 1

A:

If you want to see algorithms look up the books by Donald E. Knuth.

PS. If you think you have invented something better, then you should be able to show that the complexity is similar or better to the complexity of the known algorithms. Which for variations based on bucket and radix is O(n) on the other hand quick-sort is only O(n.log(n)). A method that is 20% faster is still O(n.log(n)) until you can show the algorithm :-)

Martin York 2009-05-01 08:51:57

Don't forget the constant hidden in the big O. The constant can be big. In my case n is quite small so this complexity evaluation is not interesting. We have to fall back to benchmarking. Using radix sort is a good suggestion as it is the fastest sorting algoritm (earned you a point from me) but requires to initialize and scan a table of 256 values, and this 4 times. So the constant hidden in the big O kills it. But I think the suggestion to use of a hash is a good move. It is referred to in the wikipedia page but I'm considering an optimization in the hash function.

chmike 2009-05-01 10:23:02

Answer 2

A:

If there are 3x3x3=27 possible values (if so why the floats?), can you create an array of 27 elements and count each possibility in one pass through the data?

rstaveley 2009-05-01 09:24:00

There are 27 different float values, not possible values. The values may be any possible float value. But in practice we may expect only positive and quite similar values.

chmike 2009-05-01 09:44:35

Answer 3

+1 A:

I suppose your best bet is to take an existing sorting algorithm and try to figure out whether you can adapt it so that the set does not need to be fully sorted. For determining the median, you need at most half the values sorted, either the lower or higher half would be enough:

original:              | 5 | 1 | 9 | 3 | 3 |
sorted:                | 1 | 3 | 3 | 5 | 9 |
lower half sorted:     | 1 | 3 | 3 | 9 | 5 |
higher half sorted:    | 3 | 1 | 3 | 5 | 9 |

The other half would be a bucket of unsorted values that merely share the property of being larger/smaller or equal to the largest/smallest sorted value.

But I have no ready algorithm for that, it's just an idea of how you might take a short-cut in your sorting.

2009-05-01 09:33:52

I think he already did that with the quick-sort partition function.

quinmars 2009-05-01 09:44:48

@quinmars: yes, that's quite possible.

2009-05-01 09:46:49

Check the wikipedia link I gave. Quick select is using the partition function used inside quick sort. It is much faster than quick sort. I also tested heap sort, but quick sort was faster, and quick select the fastest.The algorithm I "invented" used two heaps with a shared head at the median value position. One heap contains the biggest value at its top and the other the smallest value. I add values one after the other on each side, and keep the heap balanced. The common head value is the median value of the heap values. With an additional optimization it's 20% faster than quick select.

chmike 2009-05-01 10:12:15

@chmike: right, now I get what you meant by that. Makes it even harder to come up with anything that'd improve on that, though :)

2009-05-01 15:59:17

Answer 4

+14 A:

The selection algorithm is linear time (O(n)). Complexity-wise you can't do better than linear time, because it takes linear time to read in all the data. So you couldn't have made something that is faster complexity-wise. Perhaps you have something that is a constant factor faster on certain inputs? I doubt it would make much of a difference.

C++ already includes the linear-time selection algorithm. Why not just use it?

std::vector<YourType>::iterator first = yourContainer.begin();
std::vector<YourType>::iterator last = yourContainer.end();
std::vector<YourType>::iterator middle = first + (last - first) / 2;
std::nth_element(first, middle, last); // can specify comparator as optional 4th arg
YourType median = *middle;

Edit: Technically, that is only the median for a container of odd length. For one of even length, it will get the "upper" median. If you want the traditional definition of median for even length, you might have to run it twice, once for each of the two "middles" at first + (last - first) / 2 and first + (last - first) / 2 - 1 and then average them or something.

newacct 2009-05-01 10:23:13

good, +1. You may wish to ammend the case where there are an even number of elements.

Doug T. 2009-05-01 16:29:55

Thanks to pointing me to this algorithm I didn't knew. Do you know how it is implemented ? Do they use quick select ? I will add this one to my benchmark.

chmike 2009-05-01 18:10:52

Nice one! It would be even better to use std::distance for the (last-first) expression. Then it'll also work for non-contiguous iterators.

xtofl 2009-05-02 04:23:32

How do you fill the vector ? I use memcpy for the other algorithms.Is there a faster method than to use push_back ?Note that initializing the vector is included in the measurement time.

chmike 2009-05-02 06:57:48

Ok, I pre-instantiated and filled the buffer. Then I memcpy( buf.data(), ... ). It appears to be the fastest method so far.

chmike 2009-05-02 07:05:56

@xtofl: yea but nth_element() requires its iterators to be RandomAccessIterators anyway

newacct 2009-05-02 17:18:11

Theoretically your answer is correct. In practice it is possible to do it "faster" than O(n). He can for example traverse the values already earlier. If some helper values would be saved in that moment, later median selection could be "faster" than O(n). This can be seen as general strategy: do not loose any information, to not waste time restoring it later. He can already have part of median selection algorithm "implemented" in his code somewhere!

2009-05-04 06:41:06

@newacct, "Why not just use [the built in C++ library selection algorithm]?" because typically you can easily beat the execution time for the built in C++ sorting algorithms.=====Back in the day I wrote my first sloppy quick-sort that was 2.2 times faster than the built in quick-sort.

Trevor Boyd Smith 2009-05-04 13:47:33

If your vector will always have 27 items, the boost::array class is certainly faster than a std::vector

Stephen Nutt 2009-08-13 02:25:45

Answer 5

+9 A:

The question cannot easily be answered for the simple reason that the performance of one algorithm relative to another depends as much the on compiler / processor / data structure combination as on the algorithm itself, as you surely know

Therefore your approach to try a couple of them seems good enough. And yes, quicksort should be pretty fast. If you haven't done so, you might want to try insertionsort which often performs better on small data sets. This said, just settle on a sorting algo that does the job fast enough. You will typically not get 10-times faster just be picking the "right" algo.

To get substantial speed-ups, the better way frequently is to use more structure. Some ideas that worked for me in the past with large-scale problems:

Can you efficiently pre-calculate while creating the voxels and store 28 instead of 27 floats?
Is an approximate solution good enough? If so, just look at the median of, say 9 values, since "in general it can be expected that values are relatively close." Or you can replace it with the average as long as the values are relatively close.
Do you really need the median for all billions of voxels? Maybe you have an easy test whether you need the median, and can then only calculate for the relevant sub-set.
If nothing else helps: look at the asm code that the compiler generates. You might be able write asm code that is substantially faster (e.g. by doing all the calcs using registers).

Edit: For what it's worth, I have attached the (partial) insertionsort code mentioned in the comment below (totally untested). If numbers[] is an array of size N, and you want the smallest P floats sorted at the beginning of the array, call partial_insertionsort<N, P, float>(numbers);. Hence if you call partial_insertionsort<27, 13, float>(numbers);, numbers[13] will contain the median. To gain additional speed, you would have to unfold the while loop, too. As discussed above, to get really fast, you have to use your knowledge about the data (e.g. is the data already partially sorted? Do you know properties of the distribution of the data? I guess, you get the drift).

template <long i> class Tag{};

template<long i, long N, long P, typename T>
inline void partial_insertionsort_for(T a[], Tag<N>, Tag<i>)
{   long j = i <= P+1 ? i : P+1;  // partial sort
    T temp = a[i];
    a[i] = a[j];       // compiler should optimize this away where possible
    while(temp < a[j - 1] && j > 0)
    { a[j] = a[j - 1];
      j--;}
    a[j] = temp;
    partial_insertionsort_for<i+1,N,P,T>(a,Tag<N>(),Tag<i+1>());}

template<long i, long N, long P, typename T>
inline void partial_insertionsort_for(T a[], Tag<N>, Tag<N>){}

template <long N, long P, typename T>
inline void partial_insertionsort(T a[])
 {partial_insertionsort_for<0,N,P,T>(a, Tag<N>(), Tag<0>());}

stephan 2009-05-01 10:36:55

Thanks, but sorting is overkill. Quick select is much faster. The question is if there is a faster algorithm. Insertion won't be faster than quick select. The dual heap method is a smarter insert method which basically keeps track of the median value while inserting elements by just keeping the heaps balanced.

chmike 2009-05-01 14:14:50

Agree in general, but gave it a quick try before I posted the answer above: a simple insertion sort for 27 integers was faster than std::nth_element which typically is implemented with quickselect (admittedly only tried it with ints). 27 ints / floats should be highly optimized for insertionsort

stephan 2009-05-01 14:39:51

This is counterintuitive. Did yo try with random numbers ? Ints or floats shouldn't make a difference here, I assume. How did you implement it ? Did you do something special ?

chmike 2009-05-01 18:03:34

Took normal insertionsert, templated the array length, templated loop unrolling. Travelling over the weekend. Will post code on Monday when I am back in the office.

stephan 2009-05-02 06:02:07

Answer 6

+4 A:

The most likely algorithm to use in your first attempt is just nth_element; it pretty much gives you what you want directly. Just ask for the 14th element.

On your second attempt, the goal is to take advantage of the fixed data size. You do not wnat to allocate any memory at all duing your algorithm. So, copy your voxel values to a pre-allocated array of 27 elements. Pick a pivot, and copy it to the middle of a 53 element array. Copy the remaining values to either side of the pivot. Here you keep two pointers (float* left = base+25, *right=base+27). There are now three possibilities: the left side is larger, the right side is larger, or the both have 12 elements. The last case is trivial; your pivot is the median. Otherwise, call nth_element on either the left side or the right side. The exact value of Nth depends on how many values were larger or smaller than the pivot. For instance, if the division is 12/14, you need the smallest element bigger than the pivot, so Nth=0, and if the division was 14/12, you need the biggest element smaller the pivot, so Nth=13. The worst cases are 26/0 and 0/26, when your pivot was an extreme, but those happen only in 2/27th of all cases.

The third improvement (or the first, if you have to use C and do not have nth_element) replaces nth_element entirely. You still have the 53 element array, but this time you fill it directly from the voxel values (saving you an interim copy into a float[27]). The pivot in this first iteration is just voxel[0][0][0]. For subsequent iterations, you use a second pre-allocated float[53] (easier if both are the same size) and copy floats between the two. The basic iteration step here is still: copy the pivot to the middle, sort the rest to the left and the right. At the end of each step, you'll know whether the median is smaller or larger than the current pivot, so you can discard the floats bigger or smaller than that pivot. Per iteration, this eliminates between 1 and 12 elements, with an average of 25% of the remaining.

The final iteration, if you still need more speed, is based on the observation that most of your voxels overlap significantly. You pre-calculate for every 3x3x1 slice the median value. Then, when you need an initial pivot for your 3x3x3 voxel cube, you take the median of the the three. You know a priori that there are 9 voxels smaller and 9 voxels larger than that median of medians (4+4+1). So, after the first pivotting step, the worst cases are a 9/17 and a 17/9 split. So, you'd only need to find the 4th or 13th element in a float[17], instead of the 12th or 14th in a float[26].

Background: The idea of copying first a pivot and then the rest of a float[N] to a float[2N-1], using left and right pointers is that you fill a float[N] subarray around the pivot, with all elements smaller than the pivot to the left (lower index) and higher to the right (higher index). Now, if you want the Mth element, you might find yourself lucky and have M-1 elements smaller than the pivot, in which case the pivot is the element you need. If there are more than (M-1) elements smaller than the pivot, the Mth element is amongst them, so you can discard the pivot and anything bigger than the pivot, and seacrh for the Mth element in all the lower values. If there are less than (M-1) elements smaller than the pivot, you're looking for a value higher than the pivot. So, you'll discard the pivot and anything smaller than it. Let the number of elements less than the pivot, i.e. to the left of the pivot be L. In the next iteration, you want the (M-L-1)th element of the (N-L-1)floats that are bigger than the pivot.

This kind of nth_element algorithm is fairly efficient because most of the work is spent copying floats between two small arrays, both of which will be in cache, and because your state is most of the time represented by 3 pointers (source pointer, left destination pointer, right destination pointer).

To show the basic code:

float in[27], out[53];
float pivot = out[26] = in[0];     // pivot
float* left = out+25, right = out+27
for(int i = 1; i != 27; ++1)
if((in[i]<pivot)) *left-- = in[i] else *right++ = in[i];
// Post-condition: The range (left+1, right) is initialized.
// There are 25-(left-out) floats <pivot and (right-out)-27 floats >pivot

MSalters 2009-05-01 15:12:37

The algorithm you describe is quick select with the difference that quick select requires only a vector of 27 elements and works "in place". Pick a pivot, move smaller elements to the left, and bigger elements to the right, adjusting the pivot position. When done, the pivot is at the position it should be when the list is sorted. If it is at position 13, it is the median value. Otherwise the median value is in the biggest set. But your algorithm is faster by avoiding the swaping (movement of pivot) in the first pass. It should be extended to all the pass. This is worth a point.

chmike 2009-05-02 05:26:38

Answer 7

+2 A:

Alex Stepanov's new book Elements of Programming talks at some length about finding order statistics using the minimum number of average comparisons while minimizing runtime overhead. Unfortunately, a sizable amount of code is needed just to compute the median of 5 elements, and even then he gives as a project finding an alternate solution that uses a fraction of a comparison less on average, so I wouldn't dream of extending that framework to finding the median of 27 elements. And the book won't even be available until 15 June 2009. The point is that because this is a fixed-size problem, there is a direct comparison method that is provably optimal.

Also, there is the fact that this algorithm is not being run once in isolation but rather many times, and between most runs only 9 of the 27 values will change. That means in theory some of the work is done already. However, I have not heard of any median filtering algorithms in image processing that take advantage of this fact.

Mark Ruzon 2009-05-01 16:25:16

Answer 8

+2 A:

+1 for everybody who mentioned nth_element, but this kind of code is where hand written algorithm is better than STL because you want to generate the most efficient code for that one compiler running on the one CPU with a specific data set. For example, for some CPU/compiler combination std::swap(int, int) maybe slower than hand written swap using XOR (before you reply, i know this is probably true 20 years ago but not anymore). Sometimes performance is gained by hand writing assembly code specific to your CPU. If you plan to take advantage of GPU's stream processors, you may have to design your algorithm accordingly.

You mentioned using 2 heaps and keep track of the median as you insert. That's what i did a while ago in a project. I changed the array inplace and used only one heap. I could not think of any faster algorithm, but i'd like to caution you about memory usage, specifically CPU cache memory. You want to be careful with memory access. CPU cache is swapped in and out by page, so you want your algorithm to touch memory that are close together to minimize CPU cache miss.

Shing Yip 2009-05-01 17:50:46

Answer 9

A:

I'm betting that you could calculate them for zero cost - in a separate thread while loading from disk (or however they're generated).

What I'm really saying is that 'speed' isn't going to come from bit twiddling because 27 values isn't enough for Big O notation to be a real factor.

Jimmy J 2009-05-02 09:31:36

Answer 10

A:

You might want to have a look at Knuth's Exercise 5.3.3.13. It describes an algorithm due to Floyd that finds the median of n elements using (3/2)n+O(n^(2/3) log n) comparisons, and the constant hidden in the O(·) seems not to be too large in practice.

2009-05-03 07:01:10

This uses the median of median heuristic. Split the N values in groups of 5 values, find their median by sorting each group. Then find the median of the median values. This provides a "good" initial guess of the median value which is then used with the quick select algorithm. It could also be used with the dual heap algorithm that has a smaller stdDev.

chmike 2009-05-04 07:38:16

Answer 11

+8 A:

The benchmark results of the algorithms considered so far

For the protocol and short description of algorithms see below. First value is mean time (seconds) over 200 different sequences and second value is stdDev.

HeapSort     : 2.287 0.2097
QuickSort    : 2.297 0.2713
QuickMedian1 : 0.967 0.3487
HeapMedian1  : 0.858 0.0908
NthElement   : 0.616 0.1866
QuickMedian2 : 1.178 0.4067
HeapMedian2  : 0.597 0.1050
HeapMedian3  : 0.015 0.0049 <-- best

Protocol: generate 27 random floats using random bits obtained from rand(). Apply each algorithm 5 million times in a row (including prior array copy) and compute average and stdDev over 200 random sequences. C++ code compiled with icc -S -O3 and run on Intel E8400 with 8GB DDR3.

Algorithms:

HeapSort : full sort of sequence using heap sort and pick middle value. Naive implementation using subscript access.

QuickSort: full in place sort of sequence using quick sort and pick middle value. Naive implementation using subscript access.

QuickMedian1: quick select algorithm with swapping. Naive implementation using subscript access.

HeapMedian1: in place balanced heap method with prior swapping. Naive implementation using subscript access.

NthElement : uses the nth_element STL algorithm. Data is copied into the vector using memcpy( vct.data(), rndVal, ... );

QuickMedian2: uses quick select algorithm with pointers and copy in two buffers to avoid swaping. Based on proposal of MSalters.

HeapMedian2 : variant of my invented algorithm using dual heaps with shared heads. Left heap has biggest value as head, right has smallest value as head. Initialize with first value as common head and first median value guess. Add subsequent values to left heap if smaller than head, otherwise to right heap, until one of the heap is full. It is full when it contains 14 values. Then consider only the full heap. If its the right heap, for all values bigger than the head, pop head and insert value. Ignore all other values. If its the left heap, for all values smaller than the head, pop head and insert it in heap. Ignore all other values. When all values have been proceeded, the common head is the median value. It uses integer index into array. The version using pointers (64bit) appeared to be nearly twice slower (~1s).

HeapMedian3 : same algorithm as HeapMedian2 but optimized. It uses unsigned char index, avoids value swapping and various other little things. The mean and stdDev values are computed over 1000 random sequences. For nth_element I measured 0.508s and a stdDev of 0.159537 with the same 1000 random sequences. HeapMedian3 is thus 33 time faster than the nth_element stl function. Each returned median value is checked against the median value returned by heapSort and they all match. I doubt a method using hash may be significantly faster.

EDIT 1: This algorithm can be further optimized. The first phase where elements are dispatched in the left or right heap based on the comparison result doesn't need heaps. It is sufficient to simply append elements to two unordered sequences. The phase one stops as soon as one sequence is full, which means it contains 14 elements (including the median value). The second phase starts by heapifying the full sequence and then proceed as described in the HeapMedian3 algorithm. I'll provide the new code and benchmark as soon as possible.

EDIT 2: I implemented and benchmarked the optimized algorithm. But there is no significant performance difference compared heapMedian3. It is even slightly slower on the average. Shown results are confirmed. There might be with much larger sets. Note also that I simply pick the first value as initial median guess. As suggested, one could benefit from the fact that we search a median value in "overlapping" value sets. Using the median of median algorithm would help to pick a much better initial median value guess.

Source code of HeapMedian3

// return the median value in a vector of 27 floats pointed to by a
float heapMedian3( float *a )
{
   float left[14], right[14], median, *p;
   unsigned char nLeft, nRight;

   // pick first value as median candidate
   p = a;
   median = *p++;
   nLeft = nRight = 1;

   for(;;)
   {
       // get next value
       float val = *p++;

       // if value is smaller than median, append to left heap
       if( val < median )
       {
           // move biggest value to the heap top
           unsigned char child = nLeft++, parent = (child - 1) / 2;
           while( parent && val > left[parent] )
           {
               left[child] = left[parent];
               child = parent;
               parent = (parent - 1) / 2;
           }
           left[child] = val;

           // if left heap is full
           if( nLeft == 14 )
           {
               // for each remaining value
               for( unsigned char nVal = 27 - (p - a); nVal; --nVal )
               {
                   // get next value
                   val = *p++;

                   // if value is to be inserted in the left heap
                   if( val < median )
                   {
                       child = left[2] > left[1] ? 2 : 1;
                       if( val >= left[child] )
                           median = val;
                       else
                       {
                           median = left[child];
                           parent = child;
                           child = parent*2 + 1;
                           while( child < 14 )
                           {
                               if( child < 13 && left[child+1] > left[child] )
                                   ++child;
                               if( val >= left[child] )
                                   break;
                               left[parent] = left[child];
                               parent = child;
                               child = parent*2 + 1;
                           }
                           left[parent] = val;
                       }
                   }
               }
               return median;
           }
       }

       // else append to right heap
       else
       {
           // move smallest value to the heap top
           unsigned char child = nRight++, parent = (child - 1) / 2;
           while( parent && val < right[parent] )
           {
               right[child] = right[parent];
               child = parent;
               parent = (parent - 1) / 2;
           }
           right[child] = val;

           // if right heap is full
           if( nRight == 14 )
           {
               // for each remaining value
               for( unsigned char nVal = 27 - (p - a); nVal; --nVal )
               {
                   // get next value
                   val = *p++;

                   // if value is to be inserted in the right heap
                   if( val > median )
                   {
                       child = right[2] < right[1] ? 2 : 1;
                       if( val <= right[child] )
                           median = val;
                       else
                       {
                           median = right[child];
                           parent = child;
                           child = parent*2 + 1;
                           while( child < 14 )
                           {
                               if( child < 13 && right[child+1] < right[child] )
                                   ++child;
                               if( val <= right[child] )
                                   break;
                               right[parent] = right[child];
                               parent = child;
                               child = parent*2 + 1;
                           }
                           right[parent] = val;
                       }
                   }
               }
               return median;
           }
       }
   }
}

chmike 2009-05-05 16:48:20

+1 for the code that. Thanks. I must admit I am puzzled about the speed improvement. On my machine, it is "only" 2-3 times faster than std::nth_element with random numbers. Can you please check.

stephan 2009-05-05 17:28:54

I checked many times. That's also the reason I now check all returned value. The difference may be due to the compiler (icc) or the processor and cache (E8400). The OS is 64bit, so pointers are 64bits. I'll test if replacing p with an index into a is faster. Haven you tried with random floats ?

chmike 2009-05-06 06:42:20

Took float(rand()), so random numbers are not the difference. Great for you that it works. One last thing: you mentioned somewhere that the random numbers are generated by a Poisson process (?) - haye you tried looping over the array and picking an initial Pivot close to the theoretical Median?

stephan 2009-05-06 11:19:08

@chmike Have you published this algorithm? I'm looking for something similar for a median filter that I'm writing for a mobile phone application as a research project. I need to find the median of between 9-100 values. This is for pixel values. It would be fantastic if I could adapt this and try it out; how can I reference your work?

gav 2009-06-05 17:40:48

@gav Thanks for the compliment. No I didn't published it yet. I have a blog and will published it there. I'll give the link here once it is published. Unfortunately, the blog doesn't have pretty code layouts as here. I'ld also try to add it to the wikipedia page on selection algorithm with my real name. On my blog you'll have way to contact me by mail. I would be glad to hear back from your experience with the IPhone.

chmike 2009-06-09 18:54:22

@gav here is the blog note url: http://www.disnetwork.info/1/post/2009/06/median-value-selection-algorithm.html

chmike 2009-06-09 20:12:58

Answer 12

+1 A:

Good alg!!! I have use it and i have improve the calc so much. I used it in an over 2M float elements array. It is good!!

fi.masini 2009-08-13 02:04:09

Thank you very much. For 2M float elements, the hash select is preferable. 1. Make a histogram of the values (cost 1 pass). 2. locate bin containing median value. This is done by summing up bin content from 0 to j, until the sum is bigger than n/2. The last bin contains the median value. 3. build new histogram using previous bin min and max values. 4. iterate until bin contains only one value or has all identical values. This algorithm can be distributed, eventually using GPGPU. Split data set in subsets and distribute histogram generation. Sum histograms to locate median bin and iterate.

chmike 2009-09-06 08:23:52

To avoid scanning all values at each iteration, you could build a linked list of values. An array of integers containing the index of the next value would just do it. The histogram has then for each bin, the index of the first value it contains and a flag set to true if all values of the bin are identical. The flag is initialized to true, and, if true, switched to false when the inserted value is different of the first value. The linked list is initialized during the first pass. To beat the heap select you need a histogram with many bins, for instance 1024.

chmike 2009-09-07 06:09:54

Answer 13

+1 A:

A sorting network generated using the Bose-Nelson algorithm will find the median directly with no loops/recursion using 173 comparisons. If you have the facility to perform comparisions in parallel such as usage of vector-arithmetic instructions then you may be able to group the comparisions into as few as 28 parallel operations.

If you are sure that the floats are normalized and not (qs)NaN's, then you can use integer operations to compare IEEE-754 floats which can perform more favorably on some CPU's.

A direct conversion of this sorting network to C (gcc 4.2) yields a worst-case of 388 clock cycles on my Core i7.

Sorting Networks

matja 2009-09-11 19:55:47

Very interesting suggestion. For the algorithm presented above, we can expect, on average, N/2 comparison for the first phase and, at worse, N/2 log N/2 for the second phase. Rounding logN/2 to 4 and N/2 to 14, I get 70. I'll check the hashSelect.

chmike 2009-09-16 13:37:56

Are you sure that sorting the whole input array is necessary? Would it be possible to reduce the number of comparisions when you only need the median on the right position?

Dadam 2010-10-12 19:49:53

Answer 14

+4 A:

Since it sounds like you're performing a median filter on a large array of volume data, you might want to take a look at the Fast Median and Bilateral Filtering paper from SIGGRAPH 2006. That paper deals with 2D image processing, but you might be able to adapt the algorithm for 3D volumes. If nothing else, it might give you some ideas on how to step back and look at the problem from a slightly different perspective.

Boojum 2009-09-11 20:17:29

Answer 15

A:

When having let's say a million different values from which you need the median. Is it possible to base your median on a subset of those million, let's say 10%. So that the median is close to the n-th element which divides the values in 2 equal (or almost equal) subsets? Therefor, for finding the median you'll need less than O(n)-times (in this case O(1/10n) and hereby come closer to optimal sorting with quicksort in O(nlogn)?

barre 2010-10-06 19:15:40

ansaurus

tags:

views:

answers:

Fastest code C/C++ to select the median in a set of 27 floating point values

related questions