ansaurus

Question

Is there any way to find arithmetic mean "better" than sum()/N?

Answer 1

+3 A:

If the big integers are problem... is it ok

a/N + b/N+.... n/N

I mean you're looking just for other ways or the optimal way?

Svetlozar Angelov 2009-08-28 13:01:07

why?!?! If a,b,etc are integers this will give you an incorrect answer. If they are floating point, I'm not sure but my hunch is you will get more rounding errors than if you just performed a sum and then divided. In either case the computation time is increased greatly for a questionable benefit.

Jason S 2009-08-28 13:35:22

Answer 2

+1 A:

If you use float you might avoid big integers:

def simple_mean(array[N]):
    sum = 0.0 # <---
    for i = 1 to N
       sum += array[i]
    return sum / N

Nick D 2009-08-28 13:09:09

Answer 3

+17 A:

Knuth lists the following method for calculating mean and standard deviation given floating point (original on p. 232 of Vol 2 of The Art of Computer Programming, 1998 edition; my adaptation below avoids special-casing the first iteration):

double M=0, S=0;

for (int i = 0; i < N; ++i)
{
    double Mprev = M;
    M += (x[i] - M)/(i+1);
    S += (x[i] - M)*(x[i] - Mprev);
}

// mean = M
// std dev = sqrt(S/N) or sqrt(S/N+1)
// depending on whether you want population or sample std dev

Jason S 2009-08-28 13:09:55

Answer 4

+8 A:

Here's a way to calculate the mean using just integers without rounding errors and avoiding big intermediate values:

sum = 0
rest = 0
for num in numbers:
  sum += num / N
  rest += num % N
  sum += rest / N
  rest = rest % N

return sum, rest

sepp2k 2009-08-28 13:22:39

+1 Very clever!

Martin B 2009-08-28 13:29:26

This basically uses multiprecision (dual word) arithmetic. I think there's a way to optimize this to get the number of divide-like (/ or %) operations down, but I can't remember off the top of my head.

Jason S 2009-08-28 13:37:04

The usual technique is to calculate X/N and X%N in a single function/single operation. This is because the underlying algorithms are pretty much the same.

MSalters 2009-08-28 14:14:23

yes although C doesn't expose them. >:(No, I meant more like: sum += (num + rest) / N; rest = (num + rest) % N;except that may be prone to overflow

Jason S 2009-08-28 17:04:20

C doesn't expose div/mod with a single function or operator, but most good compilers will generate the instruction/call if you use both.

Stephen Canon 2009-08-28 17:34:57

Answer 5

+2 A:

If the array is floating-point data, even the "simple" algorithm suffers from rounding error. Interestingly, in that case, blocking the computation into sqrt(N) sums of length sqrt(N) actually reduces the error in the average case (even though the same number of floating-point roundings are performed).

For integer data, note that you don't need general "big integers"; if you have fewer than 4 billion elements in your array (likely), you only need an integer type 32 bits larger than that the type of the array data. Performing addition on this slightly larger type will pretty much always be faster than doing division or modulus on the type itself. For example, on most 32 bit systems, 64-bit addition is faster than 32-bit division/modulus. This effect will only become more exaggerated as the types become larger.

Stephen Canon 2009-08-28 16:21:51

ansaurus

tags:

views:

answers:

Is there any way to find arithmetic mean "better" than sum()/N?

related questions