percentile

Calculating percentile rank in MySQL

Hi, I have a very big table of measurement data in MySQL and I need to compute the percentile rank for each of these values. Oracle appears to have a function called percent_rank, but I can't find anything similar for MySQL. Sure, I could just brute-force it in Python, which I use anyway to populate the table, but I suspect t...
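For reference, the Python fallback mentioned in this question could look roughly like the sketch below. It assumes the usual SQL definition of percent rank, (rank - 1) / (n - 1), with tied values sharing the lowest rank; the function name percent_ranks is made up for illustration and the values are assumed to fit in memory.

```python
from bisect import bisect_left


def percent_ranks(values):
    """Return the percent rank of every value, in input order.

    Assumed definition: (rank - 1) / (n - 1), where tied values share
    the lowest rank. With fewer than two values every rank is 0.0.
    """
    n = len(values)
    if n < 2:
        return [0.0] * n
    ordered = sorted(values)
    # bisect_left counts the values strictly smaller than v, i.e. rank - 1.
    return [bisect_left(ordered, v) / (n - 1) for v in values]
```

Sorting once and looking each value up by binary search keeps the whole computation at O(n log n) instead of comparing every pair of values.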

How to find Nth percentile with SQLite?

I'd like to find the Nth percentile. For example: table: htwt; columns: name, gender, height, weight result: | gender | 90% height | 90% weight | | male | 190 | 90 | | female | 180 | 80 | ...

Percentiles of Live Data Capture

I am looking for an algorithm that determines percentiles for live data capture. For example, consider the development of a server application. The server might have response times as follows: 17 ms 33 ms 52 ms 60 ms 55 ms etc. It is useful to report the 90th percentile response time, 80th percentile response time, etc. The naive alg...

How do I calculate percentiles with python/numpy?

Is there a convenient way to calculate percentiles for a sequence or single-dimensional numpy array? I am looking for something similar to Excel's percentile function. I looked in NumPy's statistics reference, and couldn't find this. All I could find is the median (50th percentile), but not something more specific. ...
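As an illustration of what such a function computes, here is a minimal pure-Python sketch that interpolates linearly between the two nearest sorted values. The name percentile and its signature are assumptions for this example; NumPy's own numpy.percentile(a, q) should give the same numbers with its default settings.

```python
import math


def percentile(data, q):
    """Return the q-th percentile (0 <= q <= 100) of a non-empty sequence.

    Uses linear interpolation between the two closest sorted values.
    """
    if not data:
        raise ValueError("data must not be empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(data)
    # Fractional position of the requested percentile in the sorted data.
    pos = (len(ordered) - 1) * q / 100
    lower = math.floor(pos)
    upper = math.ceil(pos)
    if lower == upper:
        return float(ordered[lower])
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])
```

For example, percentile([1, 2, 3, 4, 5], 50) is the median, 3.0.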

Finding 99% coverage in Matlab

I have a matrix in MATLAB and I need to find the 99% value for each column, that is, the value such that 99% of the population has a larger value than this. Is there a function in MATLAB for this? ...

Calculating Percentiles (Ruby).

My code is based on the methods described here and here. def fraction?(number) number - number.truncate end def percentile(param_array, percentage) another_array = param_array.to_a.sort r = percentage.to_f * (param_array.size.to_f - 1) + 1 if r <= 1 then return another_array[0] elsif r >= another_array.size then return anothe...

Calculating percentiles in Excel with "buckets" data instead of the data list itself

I have a bunch of data in Excel that I need to get certain percentile information from. The problem is that instead of having the data set made up of each value, I instead have info on the number of or "bucket" data. For example, imagine that my actual data set looks like this: 1,1,2,2,2,2,3,3,4,4,4 The data set that I have is this:...

Select nth percentile from MySQL

I have a simple table of data, and I'd like to select the row that's at about the 40th percentile from the query. I can do this right now by first querying to find the number of rows and then running another query that sorts and selects the nth row: select count(*) as `total` from mydata; which may return something like 93, 93*0.4 = ...

Fast algorithm for repeated calculation of percentile?

In an algorithm I have to calculate the 75th percentile of a data set whenever I add a value. Right now I am doing this: (1) Get value x. (2) Insert x in an already sorted array at the back. (3) Swap x down until the array is sorted. (4) Read the element at position array[array.size * 3/4]. Point 3 is O(n), and the rest is O(1), but this is still quite ...
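The approach described in this question can be sketched in Python with the standard bisect module. The class name RunningPercentile is hypothetical, and this is the baseline from the question rather than a faster alternative: the binary search is O(log n), but inserting into a list still shifts elements, so each insertion remains O(n).

```python
from bisect import insort


class RunningPercentile:
    """Keep values sorted and read the 75th percentile after each insert."""

    def __init__(self):
        self._sorted = []

    def add(self, x):
        # Binary search finds the slot; the list insert itself is O(n).
        insort(self._sorted, x)

    def value(self):
        if not self._sorted:
            raise ValueError("no values added yet")
        # Same index as in the question: array[array.size * 3/4].
        return self._sorted[len(self._sorted) * 3 // 4]
```

Usage: call add(x) for each new value, then value() to read the current 75th percentile.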

Fast Algorithm for computing percentiles to remove outliers

I have a program that needs to repeatedly compute the approximate percentile (order statistic) of a dataset in order to remove outliers before further processing. I'm currently doing so by sorting the array of values and picking the appropriate element; this is doable, but it's a noticeable blip on the profiles despite being a fairly min...