ansaurus

Question

Average function without overflow exception

Answer 1

A:

How about BigInteger in Visual J#.

Darin Dimitrov 2010-05-24 08:03:21

Answer 2

+1 A:

If you know in advance that all your numbers are going to be 'big' (in the sense of 'much nearer long.MaxValue than zero), you can calculate the average of their distance from long.MaxValue, then the average of the numbers is long.MaxValue less that.

However, this approach will fail if (m)any of the numbers are far from long.MaxValue, so it's horses for courses...

AakashM 2010-05-24 08:05:08

This is about the same as my approach, but yours will fail for any negative number.

Tomas Lycken 2010-05-24 08:15:15

Answer 3

A:

If you're willing to sacrifice precision, you could do something like:

long num2 = 0L;
foreach (long num3 in source)
{
    num2 += 1L;
}
if (num2 <= 0L)
{
    throw Error.NoElements();
}
double average = 0;
foreach (long num3 in source)
{
    average += (double)num3 / (double)num2;
}
return average;

Andreas Brinck 2010-05-24 08:09:17

Answer 4

+2 A:

If you know approximately what the average will be (or, at least, that all pairs of numbers will have a max difference < long.MaxValue), you can calculate the average difference from that value instead. I take an example with low numbers, but it works equally well with large ones.

// Let's say numbers cannot exceed 40.
List<int> numbers = new List<int>() { 31 28 24 32 36 29 }; // Average: 30

List<int> diffs = new List<int>();

// This can probably be done more effectively in linq, but to show the idea:
foreach(int number in numbers.Skip(1))
{
    diffs.Add(numbers.First()-number);
}
// diffs now contains { -3 -6 1 5 -2 }

var avgDiff = diffs.Sum() / diffs.Count(); // the average is -1

// To get the average value, just add the average diff to the first value:
var totalAverage = numbers.First()+avgDiff;

You can of course implement this in some way that makes it easier to reuse, for example as an extension method to IEnumerable<long>.

Tomas Lycken 2010-05-24 08:11:37

If you're unlucky to have a list {long.MaxValue, long.MinValue+100, ... }, it still goes awry. But your idea seems nice.

ANeves 2010-05-24 13:53:20

@ANeves - for this to work I explicitly assumed that no two numbers should be longer than long.MaxValue apart.

Tomas Lycken 2010-05-24 15:08:58

Answer 5

A:

Perhaps you can reduce every item by calculating average of adjusted values and then multiply it by the number of elements in collection. However, you'll find a bit different number of of operations on floating point.

var items = new long[] { long.MaxValue - 100, long.MaxValue - 200, long.MaxValue - 300 };
var avg = items.Average(i => i / items.Count()) * items.Count();

Andrei Taptunov 2010-05-24 08:12:11

Answer 6

A:

You could keep a rolling average which you update once for each large number.

filip-fku 2010-05-24 08:13:10

Answer 7

+9 A:

If you're just looking for an arithmetic mean, you can perform the calculation like this:

public static double Mean(this IEnumerable<long> source)
{
    if (source == null)
    {
        throw Error.ArgumentNull("source");
    }

    double count = (double)source.Count();
    double mean = 0D;

    foreach(long x in source)
    {
        mean += (double)x/count;
    }

    return mean;
}

Edit:

In response to comments, there definitely is a loss of precision this way, due to performing numerous divisions and additions. For the values indicated by the question, this should not be a problem, but it should be a consideration.

Programming Hero 2010-05-24 08:18:38

Excellent answer - minimal loss of precision, minimal chance of overflow, and gets the right answer! +1 from me... However: `IEnumerable` doesn't have a `.Count()`, so you should maybe correct your parameter type (or make explicit that you're using Linq). Oh, and nice avatar ;)

Dan Puzey 2010-05-24 09:26:09

@Dan, `IEnumerable` *does* have a `.Count()`, given that you include a `using` statement for `System.Linq`.

Tomas Lycken 2010-05-24 10:53:23

If `count` is very large, and the elements are small, the loss of precision might not be negligible. The more elements you have and the smaller they are, the worse this performs...

Aviad P. 2010-05-24 12:05:07

@Tomas - fair point - I missed the `using` in the OP. He's already had my +1 anyway ;-)

Dan Puzey 2010-05-24 12:11:15

Answer 8

A:

Use the IntX library on CodePlex.

leppie 2010-05-24 08:29:00

Answer 9

+2 A:

Here is how I would do if given this problem. First let's define very simple RationalNumber class, which contains two properties - Dividend and Divisor and an operator for adding two complex numbers. Here is how it looks:

public sealed class RationalNumber
{
    public RationalNumber()
    {
        this.Divisor = 1;
    }


    public static RationalNumberoperator +( RationalNumberc1, RationalNumber c2 )
    {
        RationalNumber result = new RationalNumber();

        Int64 nDividend = ( c1.Dividend * c2.Divisor ) + ( c2.Dividend * c1.Divisor );
        Int64 nDivisor = c1.Divisor * c2.Divisor;
        Int64 nReminder = nDividend % nDivisor;

        if ( nReminder == 0 )
        {
            // The number is whole
            result.Dividend = nDividend / nDivisor;
        }
        else
        {
            Int64 nGreatestCommonDivisor = FindGreatestCommonDivisor( nDividend, nDivisor );

            if ( nGreatestCommonDivisor != 0 )
            {
                nDividend = nDividend / nGreatestCommonDivisor;
                nDivisor = nDivisor / nGreatestCommonDivisor;
            }

            result.Dividend = nDividend;
            result.Divisor = nDivisor;
        }

            return result;
    }


    private static Int64 FindGreatestCommonDivisor( Int64 a, Int64 b)
    {
        Int64 nRemainder;

        while ( b != 0 )
        {
            nRemainder = a% b;
            a = b;
            b = nRemainder;
        }

        return a;
    }


    // a / b = a is devidend, b is devisor
    public Int64 Dividend   { get; set; }
    public Int64 Divisor    { get; set; }
}

Second part is really easy. Let's say we have an array of numbers. Their average is estimated by Sum(Numbers)/Length(Numbers), which is the same as Number[ 0 ] / Length + Number[ 1 ] / Length + ... + Number[ n ] / Length. For to be able to calculate this we will represent each Number[ i ] / Length as a whole number and a rational part ( reminder ). Here is how it looks:

Int64[] aValues = new Int64[] { long.MaxValue - 100, long.MaxValue - 200, long.MaxValue - 300 };

List<RationalNumber> list = new List<RationalNumber>();
Int64 nAverage = 0;

for ( Int32 i = 0; i < aValues.Length; ++i )
{
    Int64 nReminder = aValues[ i ] % aValues.Length;
    Int64 nWhole = aValues[ i ] / aValues.Length;

    nAverage += nWhole;

    if ( nReminder != 0 )
    {
        list.Add( new RationalNumber() { Dividend = nReminder, Divisor = aValues.Length } );
    }
}

RationalNumber rationalTotal = new RationalNumber();

foreach ( var rational in list )
{
    rationalTotal += rational;
}

nAverage = nAverage + ( rationalTotal.Dividend / rationalTotal.Divisor );

At the end we have a list of rational numbers, and a whole number which we sum together and get the average of the sequence without an overflow. Same approach can be taken for any type without an overflow for it, and there is no lost of precision.

EDIT:

Why this works:

Define: A set of numbers.

if Average( A ) = SUM( A ) / LEN( A ) =>

Average( A ) = A[ 0 ] / LEN( A ) + A[ 1 ] / LEN( A ) + A[ 2 ] / LEN( A ) + ..... + A[ N ] / LEN( 2 ) =>

if we define An to be a number that satisfies this: An = X + ( Y / LEN( A ) ), which is essentially so because if you divide A by B we get X with a reminder a rational number ( Y / B ).

=> so

Average( A ) = A1 + A2 + A3 + ... + AN = X1 + X2 + X3 + X4 + ... + Reminder1 + Reminder2 + ...;

Sum the whole parts, and sum the reminders by keeping them in rational number form. In the end we get one whole number and one rational, which summed together gives Average( A ). Depending on what precision you'd like, you apply this only to the rational number at the end.

Ivan Zlatanov 2010-05-24 09:25:35

You are using misleading names (`ComplexNumber`? where's the real and imaginary parts?! - you probably meant `RationalNumber` - `left` and `right` for a GCD function?!). You are using modulos, divisions and the GCD algorithm during addition so I don't understand how this is faster than @Programming Hero's solution. You're not exactly clear about how and why it works either. -1.

IVlad 2010-05-24 10:08:17

I take your criticism and will update my answer. I rechecked my code for testing the speed. My mistake. I will correct my comment.

Ivan Zlatanov 2010-05-24 10:23:58

Answer 10

+2 A:

Simple answer with LINQ...

var data = new[] { int.MaxValue, int.MaxValue, int.MaxValue };
var mean = (int)data.Select(d => (double)d / data.Count()).Sum();

Depending on the size of the set fo data you may want to force data .ToList() or .ToArray() before your process this method so it can't requery count on each pass. (Or you can call it before the .Select(..).Sum().)

Matthew Whited 2010-05-24 10:56:58

Answer 11

+3 A:

You may try the following approach:

let number of elements is N, and numbers are arr[0], .., arr[N-1].

You need to define 2 variables:

mean and remainder.

initially mean = 0, remainder = 0.

at step i you need to change mean and remainder in the following way:

mean += arr[i] / N;
remainder += arr[i] % N;
mean += remainder / N;
remainder %= N;

after N steps you will get correct answer in mean variable and remainder / N will be fractional part of the answer (I am not sure you need it, but anyway)

Miollnyr 2010-05-24 11:05:59

Answer 12

+12 A:

You can compute this almost exactly, without resorting to a larger integer type or big ints. The trick is to keep the remainders separate from the quotients.

public static double Mean(this IEnumerable<Int64> values) {
    Int32 n = values.Count();
    if (n == 0) return 0;
    Int64 q = 0;
    Int64 r = 0;
    foreach (var v in values) {
        q += v / n;
        r += v % n;
    }
    q += r / n;
    r = r % n;
    return q + r / (double)n;
}

This should work as long as there are fewer than Int32.MaxValue items in the collection. Moving the 'q += r/n, r = r % n' lines inside the loop should make it work for even larger collections at the cost of some speed.

You can generalize this to work online (always knowing the average for the values seen so far), using the avg[n+1] = (avg[n]*n+v)/(n+1) formula, but it is much more complicated code.

Strilanc 2010-05-24 11:09:23

+1 for more accuracy while handling any values within the Int64 range and concise code

DanK 2010-05-24 13:16:41

ansaurus

tags:

views:

answers:

Average function without overflow exception

related questions