median

algorithm for nth_element

I have recently found out that there exists a method called nth_element in the STL. To quote the description: Nth_element is similar to partial_sort, in that it partially orders a range of elements: it arranges the range [first, last) such that the element pointed to by the iterator nth is the same as the element that wou...

How do I find the median of numbers in linear time using heaps?

Wikipedia says: Selection algorithms: Finding the min, max, both the min and max, median, or even the k-th largest element can be done in linear time using heaps. All it says is that it can be done, and not how. Can you give me some start on how this can be done using heaps? ...
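As a starting point for the heap idea in the excerpt above, here is a minimal two-heap sketch in Python. The class name `RunningMedian` and its methods are hypothetical, chosen for illustration. Note the hedge: each insertion costs O(log n), so this approach is O(n log n) overall rather than linear, and it should be read as one common heap-based technique, not as the method the quoted article has in mind.

```python
import heapq


class RunningMedian:
    """Hypothetical helper: tracks the median of a stream with two heaps."""

    def __init__(self):
        self._low = []   # max-heap of the smaller half (values stored negated)
        self._high = []  # min-heap of the larger half

    def add(self, x):
        # Route x through the lower half so that every value in _low
        # stays <= every value in _high.
        heapq.heappush(self._low, -x)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        # Rebalance so that _low holds the extra element when the count is odd.
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low:
            raise ValueError("median of an empty stream is undefined")
        if len(self._low) > len(self._high):
            return -self._low[0]
        # Even count: average the two middle values.
        return (-self._low[0] + self._high[0]) / 2


rm = RunningMedian()
for value in [5, 8, 4, 6, 7]:
    rm.add(value)
print(rm.median())
```

The two heap tops are always the middle elements, so reading the median is O(1) after any insertion.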

How can I calculate data for a boxplot (quartiles, median) in a Rails app on Heroku? (Heroku uses PostgreSQL)

I'm trying to calculate the data needed to generate a box plot, which means I need to figure out the 1st and 3rd quartiles along with the median. I have found some solutions for doing it in PostgreSQL, but they seem to depend on either PL/Python or PL/R, neither of which Heroku seems to have enabled for their PostgreSQL datab...

Confused about definition of a 'median' when constructing a kd-Tree

Hi there. I'm trying to build a kd-tree for searching through a set of points, but am getting confused about the use of 'median' in the Wikipedia article. For ease of use, the Wikipedia article states the pseudo-code of kd-tree construction as: function kdtree (list of points pointList, int depth) { if pointList is empty retu...

Parallel computation of the median of a large array

I got asked this question once and still haven't been able to figure it out: You have an array of N integers, where N is large, say, a billion. You want to calculate the median value of this array. Assume you have m+1 machines (m workers, one master) to distribute the job to. How would you go about doing this? Since the median is a no...

Excel 2007 MedianIfs()

I want to calculate some statistics. In order to calculate the average of certain values of a column, I use AverageIfs(). Now I want to calculate the median for the same values. But there is no MedianIfs() function. Is there a simple solution to calculate the median for values that hold certain conditions (2 conditions)? ...

How to calculate median of a Map<Int,Int>?

For a map where the key represents a number in a sequence and the value counts how often that number appears in the sequence, what would a Java implementation of an algorithm to calculate the median look like? For example: 1,1,2,2,2,2,3,3,3,4,5,6,6,6,7,7 in a map: Map<Int,Int> map = ... map.put(1,2) map.put(2,4) map.put(3,3) m...
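One way to approach this, sketched in Python rather than Java to keep the example short: walk the keys in sorted order, accumulate the counts, and stop once the running total passes the middle rank(s). The function name `median_from_counts` is hypothetical, and the dictionary below is derived from the example sequence in the question, not from the truncated `map.put` calls.

```python
def median_from_counts(counts):
    """Median of a multiset given as {value: occurrences} (hypothetical helper)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("median of an empty multiset is undefined")

    # 0-based ranks of the middle element(s); they coincide when total is odd.
    lo_rank = (total - 1) // 2
    hi_rank = total // 2

    seen = 0
    lo_val = None
    for value in sorted(counts):
        seen += counts[value]
        if lo_val is None and seen > lo_rank:
            lo_val = value
        if seen > hi_rank:
            # Average the two middle values (identical for an odd total).
            return (lo_val + value) / 2

    raise AssertionError("unreachable when all counts are non-negative")


# Counts for the sequence 1,1,2,2,2,2,3,3,3,4,5,6,6,6,7,7
counts = {1: 2, 2: 4, 3: 3, 4: 1, 5: 1, 6: 3, 7: 2}
print(median_from_counts(counts))  # 3.0
```

The sequence is never expanded, so the cost depends on the number of distinct keys (sorting them) rather than on the total count.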

When to use geometric vs arithmetic mean?

So I guess this isn't technically a code question, but it's something that I'm sure will come up for other folks as well as myself while writing code, so hopefully it's still a good one to post on SO. The Google has directed me to plenty of nice lengthy explanations of when to use one or the other as regards financial numbers, and thing...

Medians of upper and lower halves of a vector

I am trying to compile an Octave .oct function to calculate the medians of the upper and lower "halves" of a sorted vector which will vary in length e.g. for an odd length vector such as [5,8,4,6,7] I want the "lower" median value of 4,5 and 6 and the "upper" median value of 6,7 and 8 (6 is part of both calculations), and for an even len...

Compute median of column in SQL common table expression

In MSSQL2008, I am trying to compute the median of a column of numbers from a common table expression using the classic median query as follows: WITH cte AS ( SELECT number FROM table ) SELECT cte.*, (SELECT (SELECT ( (SELECT TOP 1 cte.number FROM (SELECT TOP 50 PERCENT cte.number FROM cte ...

Incremental median computation with max memory efficiency

I have a process that generates values and that I observe. When the process terminates, I want to compute the median of those values. If I had to compute the mean, I could just store the sum and the number of generated values and thus have O(1) memory requirement. How about the median? Is there a way to save on the obvious O(n) coming f...

Using a conditional statement in Microsoft Excel

I am trying to find the median of some prices where another column matches, i.e.,

Prices  Type of product
1       Bananas
4       Peas
9       Bananas
20      Beans
5       Bananas
90      Apples

I know how to pull the median price for all of them as a group, but I need...

How to learn if a value is even or odd in bash?

I am building a movie database and I need to find a median for ratings. I'm really new to bash (it's my first assignment). I wrote: let evencheck=$"(($Amount_of_movies-$Amount_of_0_movies)%2)" if [ $evencheck==0 ] then let median="(($Amount_of_movies-$Amount_of_0_movies)/2)" else let median="(($Amount_of_movies-$Amount_of_0_movies)/2...

Median Algorithm in O(log n)

How can we remove the median of a set with time complexity O(log n)? Any ideas? ...

How To Get the Median Of n odd elements using quicksort?

Can anyone explain how to improve the quicksort algorithm for finding the median of n odd numbers and what will be the worst case scenario for that algorithm? Please help. ...
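The usual adaptation of quicksort for this problem is quickselect: partition around a pivot, then continue only in the side that contains the middle rank. The Python sketch below is a minimal illustration with hypothetical names (`quickselect`, `median_odd`) and a random pivot, which is an assumption of this example and not something the question specifies. With a random pivot the expected running time is linear; the worst case, when the pivot is repeatedly an extreme element, is O(n^2).

```python
import random


def quickselect(items, k):
    """Return the k-th smallest element (0-based) of items; hypothetical helper."""
    items = list(items)  # work on a copy so the caller's data is untouched
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    lo, hi = 0, len(items) - 1
    while True:
        if lo == hi:
            return items[lo]
        pivot = items[random.randint(lo, hi)]
        # Three-way partition of items[lo..hi]: < pivot | == pivot | > pivot
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        if k < lt:
            hi = lt - 1      # answer lies among the smaller elements
        elif k > gt:
            lo = gt + 1      # answer lies among the larger elements
        else:
            return pivot     # k falls inside the block equal to the pivot


def median_odd(values):
    """Median of an odd-length collection via quickselect."""
    n = len(values)
    if n % 2 == 0:
        raise ValueError("expected an odd number of elements")
    return quickselect(values, n // 2)


print(median_odd([5, 8, 4, 6, 7]))  # 6
```

Unlike a full quicksort, only one side of each partition is visited, which is where the savings come from.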

Standard sorting networks for small values of n

I'm looking for a sorting network implementation of a 5-element sort, but since I couldn't find a good reference on SO, I'd like to ask for sorting networks for all small values of n, at least n=3 through n=6 but higher values would be great too. A good answer should at least list them as sequences of "swap" (sort on 2 elements) operatio...

Optimal median of medians selection - 3 element blocks vs 5 element blocks?

I'm working on a quicksort-variant implementation based on the Select algorithm for choosing a good pivot element. Conventional wisdom seems to be to divide the array into 5-element blocks, take the median of each, and then recursively apply the same blocking approach to the resulting medians to get a "median of medians". What's confusi...

Taking median of calculation in SQL Server

MyTable in SQL Server contains _TimeStamp, Column1, Column2, and Column3, which have the following values:

_TimeStamp                Column1  Column2  Column3
'2010-10-11 15:55:25.40'  10       3        0.5
'2010-10-11 15:55:25.50'  20       9        0.7
'2010-10-11 15:55:25.60'  15       2        1.3
'2010-10-11 15:55:25.70'  17       8        2.7
'2010-10-11 15:55:25.80'  42       6        3.6
'2...

Calculate median for each subject with update on ties? [R]

I have data which looks like this (this is test data for illustration): test <- matrix(c(1, 1, 1, 2, 2, 2 , 529, 528, 528, 495, 525, 510,557, 535, 313,502,474, 487 ), nr=6, dimnames=list(c(1,2,3,4,5,6),c("subject", "rt1", "rt2"))) And I need to turn it into this: test2<-matrix(c(1,1,1,2,2,2,529,528,528,495,525,510,"slow","slow","fa...