tags:

views:

1203

answers:

2

As part of a test bench I'm building, I'm looking for a simple class to calculate a histogram of integer values (number of iterations taken for an algorithm to solve a problem). The answer should be called something like this:

Histogram my_hist = new Histogram();

for( uint i = 0; i < NUMBER_OF_RESULTS; i++ )
{

    myHist.AddValue( some_result );
}

for( uint j = 0; j < myHist.NumOfBins; j++ )
{
     Console.WriteLine( "{0} occurred {1} times", myHist.BinValues[j], myHist.BinCounts[j] );
}

I was suprised a bit of googling didn't turn up a neat solution but maybe I didn't search for the right things. Is there a generic solution out there or is it worth rolling my own?

+4  A: 

You could use SortedDictionary

uint[] items = new uint[] {5, 6, 1, 2, 3, 1, 5, 2}; // sample data
SortedDictionary<uint, int> histogram = new SortedDictionary<uint, int>();
foreach (uint item in items) {
    if (histogram.ContainsKey(item)) {
        histogram[item]++;
    } else {
        histogram[item] = 1;
    }
}
foreach (KeyValuePair<uint, int> pair in histogram) {
    Console.WriteLine("{0} occurred {1} times", pair.Key, pair.Value);
}

This will leave out empty bins, though

Steef
+1: That looks like a good start. As it happens I'm only interested in bins which contain data :-)
Jon Cage
+2  A: 

Based on BastardSaint's suggestion I came up with a neat and fairly generic wrapper:

public class Histogram<TVal> : SortedDictionary<TVal, uint>
{
    public void IncrementCount(TVal binToIncrement)
    {
        if (ContainsKey(binToIncrement))
        {
            this[binToIncrement]++;
        }
        else
        {
            Add(binToIncrement, 1);
        }
    }
}

So now I can do:

const uint numOfInputDataPoints = 5;
Histogram<uint> hist = new Histogram<uint>();

// Fill the histogram with data
for (uint i = 0; i < numOfInputDataPoints; i++)
{
    // Grab a result from my algorithm
    uint numOfIterationsForSolution = MyAlorithm.Run();

    // Add the number to the histogram
    hist.IncrementCount( numOfIterationsForSolution );
}

// Report the results
foreach (KeyValuePair<uint, uint> histEntry in hist.AsEnumerable())
{
    Console.WriteLine("{0} occurred {1} times", histEntry.Key, histEntry.Value);
}

Took me a while to work out how to make it generic (to begin with I just overrode the SortedDictionary constructor which meant you could only use it for uint keys).

Jon Cage
BastardSaint's method of checking using Contains() is somewhat (a lot) wiser than relying on exceptions. This will give a spike everytime a new number's frequency is getting stored.
Cecil Has a Name
Thinking about it now, perhaps doing the check every time is a better way of checking for existance. I guess it depends whether you're expecting to add lots of very similar items (which I am) or whether you're expecting a histogram with many more unique entires. My hunch was that it would be faster in my case(?)
Jon Cage
Changed the example to use the if-else solution.
Jon Cage