views:

96

answers:

5

Hi,

Is there any built in library that I can use for calculating median in java??

I was working with apache.commons.math for other statistical functions but median was nowhere to be found.

Thank you,

+8  A: 

Put all the numbers into a list, sort the list, and take the middle value (or the average of the two middle values for even size). No calculations necessary

Dave McClelland
ok so sorting it and taking the middle value that is: sortedList[sortedList.lenght/2] should give me the correct median right?
jillika iyer
Jon Freedman
The statistical median is nothing more than the middle value, so yes that should be fine. When there are an even number of values, however, you can't just take the middle since there is no middle. Instead, you should take the average of the two values closest to the middle. In a list of size 4, average the second and third values. Code wise, it would be `(list[list.length/2]+list[(list.length/2)+1])/2`
Dave McClelland
this gives incorrect values : (list[list.length/2]+list[(list.length/2)+1])/2
jillika iyer
@jillika My apologies, I didn't zero base my mental math. Try `(list[list.length/2]+list[(list.length/2)-1])/2` instead
Dave McClelland
+2  A: 

Try the Median object.

duffymo
+4  A: 

Which version of Apache Commmons Math are you using? There is a median class at least in 2.1 onwards (older version i am not sure). You can use it as:

Median median = new Median();
median.evaluate(values);
Ankit
works! great thanks :) i was using 2.0 :P
jillika iyer
What kind of a ridiculous overengineered API design is that? IMHO there's no excuse for something as simple as calulating a median requiring the instantiation of an object and taking more than one line of code.
dsimcha
+1  A: 

Get the values into a List object. Let's suppose the values are integers and the list is called "values". Then

List<Integer> values;
... populate values ...
Collections.sort(values);
int median;
int midpoint=values.size()/2;
if (values.size()%2==1)
  median=values.get(midpoint+1).intValue();
else
  median=(values.get(midpoint).intValue()+values.get(midpoint+1).intValue())/2;

If the number of values is large, like in the hundreds or more, messing with the mod-2 may be technically correct but superfluous.

Maybe there's a more efficient way to do it -- sorting is pretty slow on a large list -- but this would work.

Oh, and you really should check for a list with zero entries. Maybe I'm missing other boundary conditions.

Jay
A: 

From the "too-much-time-on-my-hands" department: here is a little MedianGenerator class:

/**
 * Methods to calculate the median value of a supplied {@link List}.
 */
public final class MedianGenerator{

    private MedianGenerator(){
    }

    /**
     * Calculate the median of a supplied list.
     * <ol>
     * <li>A copy will be generated</li>
     * <li>this copy will be sorted with the supplied comparator</li>
     * <li>the median will be calculated, using the supplied averageCalculator
     * for collections with an even number of items</li>
     * </ol>
     * 
     * @param data 
     * @param comparator
     * @param averageCalculator
     * @return the median
     */
    public static <T> T calculateMedian(final List<T> data,
        final Comparator<? super T> comparator,
        final AverageCalculator<T> averageCalculator){
        final List<T> copy = new ArrayList<T>(data);
        Collections.sort(copy, comparator);
        return doCalculateMedian(data, averageCalculator);

    }

    /**
     * Calculate the median of a supplied list.
     * <ol>
     * <li>A copy will be generated</li>
     * <li>this copy will be sorted with the supplied comparator</li>
     * <li>the median will be calculated, using the {@link #ALWAYS_FIRST} {@link AverageCalculator}
     * for collections with an even number of items</li>
     * </ol>
     * 
     * @param data 
     * @param comparator
     * @return the median
     */
    @SuppressWarnings("unchecked")
    public static <T> T calculateMedian(final List<T> data,
        final Comparator<? super T> comparator){
        return calculateMedian(data, comparator, (AverageCalculator<T>) ALWAYS_FIRST);
    }

    /**
     * Calculate the median of a supplied list.
     * <ol>
     * <li>A copy will be generated</li>
     * <li>this copy will be sorted using natural ordering</li>
     * <li>the median will be calculated, using the {@link #ALWAYS_FIRST} {@link AverageCalculator}
     * for collections with an even number of items</li>
     * </ol>
     * 
     * @param data 
     * @return the median
     */
    @SuppressWarnings("unchecked")
    public static <T extends Comparable<? super T>> T calculateMedian(final List<T> data){
        return calculateMedian(data, (AverageCalculator<T>) ALWAYS_FIRST);
    }

    /**
     * Calculate the median of a supplied list.
     * <ol>
     * <li>A copy will be generated</li>
     * <li>this copy will be sorted using natural ordering</li>
     * <li>the median will be calculated, using the supplied averageCalculator
     * for collections with an even number of items</li>
     * </ol>
     * 
     * @param data
     * @param averageCalculator 
     * @return the median
     */
    public static <T extends Comparable<? super T>> T calculateMedian(final List<T> data,
        final AverageCalculator<T> averageCalculator){
        final List<T> copy = new ArrayList<T>(data);
        Collections.sort(copy);
        return doCalculateMedian(copy, averageCalculator);
    }

    private static <T> T doCalculateMedian(final List<T> sortedData,
        final AverageCalculator<T> averageCalculator){
        T result;
        if(sortedData.isEmpty()){
            result = null;
        } else{
            final int size = sortedData.size();
            if(size % 2 == 0){
                result =
                    averageCalculator.getAverage(sortedData.get(size / 2 - 1),
                        sortedData.get(size / 2));
            } else{
                result = sortedData.get(size / 2 - 1);
            }

        }
        return result;
    }

    /**
     * Generic accessor method for {@link #ALWAYS_FIRST}.
     */
    @SuppressWarnings("unchecked")
    public static <T> AverageCalculator<T> alwaysFirst(){
        return ALWAYS_FIRST;
    }

    /**
     * {@link AverageCalculator} implementation that always returns the lower
     * bound unchanged.
     */
    @SuppressWarnings("rawtypes")
    public static final AverageCalculator ALWAYS_FIRST =
        new AverageCalculator(){

            @Override
            public Object getAverage(final Object first, final Object second){
                return first;
            }

        };

    /**
     * When there is an even number of items, this interface is used to generate
     * the average between the two middle items.
     */
    public static interface AverageCalculator<E> {

        E getAverage(E first, E second);
    }

}
seanizer