The development version of Django has aggregate functions like Avg, Count, Max, Min, StdDev, Sum, and Variance. Is there a reason Median is missing from the list?

Implementing one seems like it would be easy. Am I missing something? How much are the aggregate functions doing behind the scenes?

+2  A: 

Well, the reason is probably that you need to keep track of all the numbers to calculate the median. Avg, Count, Max, Min, StdDev, Sum, and Variance can all be calculated with constant storage. That is, once you "record" a number you'll never need it again.

FWIW, the variables you need to track are: min, max, count, <n> (the average of the values), and <n^2> (the average of the squares of the values).
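
As a rough illustration of the constant-storage point, here is a minimal Python sketch. The class name `RunningStats` and its attributes are made up for this example, and it stores running sums rather than running averages (the averages follow by dividing by the count). It uses the textbook identity variance = <n^2> - <n>^2, which can lose precision on large or tightly clustered data, so treat it as a sketch rather than production code.

```python
import math


class RunningStats:
    """Hypothetical accumulator: a handful of numbers, no matter how many values are seen."""

    def __init__(self):
        self.count = 0
        self.total = 0.0        # running sum of the values
        self.total_sq = 0.0     # running sum of the squared values
        self.minimum = math.inf
        self.maximum = -math.inf

    def add(self, x):
        # Each value is folded into the running totals and then discarded.
        self.count += 1
        self.total += x
        self.total_sq += x * x
        self.minimum = min(self.minimum, x)
        self.maximum = max(self.maximum, x)

    @property
    def mean(self):
        return self.total / self.count

    @property
    def variance(self):
        # Population variance via <n^2> - <n>^2, clamped at zero to absorb rounding error.
        return max(self.total_sq / self.count - self.mean ** 2, 0.0)

    @property
    def stddev(self):
        return math.sqrt(self.variance)


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(value)
```

No comparable fixed set of running totals gives an exact median, because the middle value can change with every new number that arrives.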

dmo
+2  A: 

A strong possibility is that median is not part of standard SQL.

Also, it requires a sort, making it quite expensive to compute.

S.Lott
There are linear, non-sorting algorithms: http://valis.cs.uiuc.edu/~sariel/research/CG/applets/linear_prog/median.html
Todd Gardner
Wrong algorithm, I meant median of medians: http://en.wikipedia.org/wiki/Selection_algorithm#Linear_general_selection_algorithm_-_.22Median_of_Medians_algorithm.22
Todd Gardner
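
For readers who want to see the median-of-medians idea mentioned above, here is a minimal Python sketch. The function names are made up for this example, and it copies sublists for clarity instead of partitioning in place, so it shows the pivot-selection idea rather than a tuned implementation. Note that it still needs every value in memory at once.

```python
def median_of_medians_select(values, k):
    """Return the k-th smallest element (0-based) using median-of-medians pivots."""
    items = list(values)
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        if len(items) <= 5:
            # Small inputs: sorting a constant-size list is cheap.
            return sorted(items)[k]
        # Take the median of each group of up to five elements.
        groups = [items[i:i + 5] for i in range(0, len(items), 5)]
        medians = [sorted(group)[len(group) // 2] for group in groups]
        # Recursively pick the median of those medians as the pivot.
        pivot = median_of_medians_select(medians, len(medians) // 2)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(lower):
            items = lower
        elif k < len(lower) + len(equal):
            return pivot
        else:
            # Skip past everything <= pivot and continue in the upper part.
            k -= len(lower) + len(equal)
            items = [x for x in items if x > pivot]


def median(values):
    """Median via selection; averages the two middle values for even-length input."""
    items = list(values)
    n = len(items)
    if n == 0:
        raise ValueError("median of empty sequence")
    upper = median_of_medians_select(items, n // 2)
    if n % 2:
        return upper
    lower = median_of_medians_select(items, n // 2 - 1)
    return (lower + upper) / 2
```

For example, `median([5, 1, 4, 2, 3])` selects the element at index 2 of the sorted order without sorting the whole list.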
@Todd Gardner: The first link is the "partition-based general selection" and it's O(nlogn) not linear. The site is wrong. It would be nice to delete that comment, but leave the median-of-medians comment.
S.Lott
+6  A: 

Because median isn't a SQL aggregate. See, for example, the list of PostgreSQL aggregate functions and the list of MySQL aggregate functions.

jacobian
+1  A: 

I have no idea what db backend you are using, but if your db supports another aggregate, or you can find a clever way of doing it, you can probably access it easily through Aggregate.
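
As one possible illustration of a backend that can be taught an extra aggregate, here is a sketch using Python's standard `sqlite3` module, which lets you register a user-defined aggregate through `Connection.create_aggregate`. This is plain `sqlite3`, not Django: the `score` table, its columns, and the `median_by_group` helper are invented for the example, and how you would expose such a function through Django's ORM depends on your Django version and is not shown here.

```python
import sqlite3
import statistics


class MedianAggregate:
    """User-defined SQLite aggregate; it has to keep every value it sees."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per row in the group; NULLs are skipped like in built-in aggregates.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # Called once per group to produce the result.
        return statistics.median(self.values) if self.values else None


def median_by_group(rows):
    """Load (team, points) rows into a throwaway in-memory table and query medians."""
    conn = sqlite3.connect(":memory:")
    conn.create_aggregate("median", 1, MedianAggregate)
    conn.execute("CREATE TABLE score (team TEXT, points REAL)")
    conn.executemany("INSERT INTO score VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT team, median(points) FROM score GROUP BY team ORDER BY team"
    ).fetchall()
    conn.close()
    return result


rows = [("a", 1), ("a", 3), ("a", 2), ("b", 10), ("b", 20)]
medians = median_by_group(rows)
```

The aggregate object holds the whole group in a list, which is the storage cost the other answers describe.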

TokenMacGuy