With Boost's accumulators I can easily calculate statistical quantities for weighted or unweighted input sets. I wonder if it is possible to mix weighted and unweighted quantities inside the same accumulator. Looking at the docs it doesn't seem that way.

This compiles fine but produces a different result from the one I wanted:

using namespace boost::accumulators;

const double a[] = {1, 1, 1, 1, 1, 2, 2, 2, 2};
const double w[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

accumulator_set<double, features<tag::sum, tag::weighted_sum>, double> stats;
for (size_t i=0; i<9; ++i)
  stats(a[i], weight = w[i]);

std::cout << sum(stats) <<" "<< weighted_sum(stats) << std::endl;
// outputs "75 75" instead of "13 75"

Also, with a third template parameter to accumulator_set I always seem to get weighted quantities, even when using an "unweighted" feature and extractor:

accumulator_set<double, features<tag::sum>, double> stats;
for (size_t i=0; i<9; ++i)
  stats(a[i], weight = w[i]);
std::cout << sum(stats) << std::endl;
// outputs "75" instead of 13

Do I always have to use two different accumulators if I want to calculate both weighted and unweighted quantities?

EDIT: I use sum only as an example; in reality I am interested in several more complicated quantities.

A: 

The documentation does say:

When you specify a weight, all the accumulators in the set are replaced with their weighted equivalents.
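To make the quoted sentence concrete, here is a minimal sketch in plain C++ (no Boost) of the two quantities the question compares. The helper names `plain_sum` and `weighted_total` are made up for this illustration; they only mirror the arithmetic behind the "13" and "75" reported in the question, not Boost's internals.

```cpp
#include <cstddef>
#include <vector>

// Plain sum of the samples: the result the question expected from sum().
double plain_sum(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) total += v;
    return total;
}

// Sum of weight * sample: the result the question actually observed from
// sum() once the accumulator_set was given a weight type.
double weighted_total(const std::vector<double>& values,
                      const std::vector<double>& weights) {
    double total = 0.0;
    for (std::size_t i = 0; i < values.size() && i < weights.size(); ++i)
        total += weights[i] * values[i];
    return total;
}
```

With the arrays `a` and `w` from the question, `plain_sum(a)` is 13 and `weighted_total(a, w)` is 75, which matches the behaviour described: once weights are in play, the "unweighted" feature reports the second number.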

There are probably better ways to do it, but you can try something like this (basically swapping the roles of the value and the weight):

const double a[] = {1, 1, 1, 1, 1, 2, 2, 2, 2};
const double w[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

// Note the swap: w is fed in as the sample and a as the weight.
accumulator_set< double, stats< tag::sum, tag::sum_of_weights >, double > acc;
for( std::size_t i = 0; i < sizeof( a ) / sizeof( a[ 0 ] ); i++ )
   acc( w[ i ], weight = a[ i ] );

std::cout << extract_result< tag::sum >( acc ) << std::endl; // weighted sum, prints 75
std::cout << extract_result< tag::sum_of_weights >( acc ) << std::endl; // sum, prints 13
Eugen Constantin Dinca
Thanks! Yes, that would work for the case of `sum`. I didn't make it clear enough that it was just an example; I am really interested in extracting other, more complicated quantities too (like variance, median and mean, and their weighted counterparts). For a sum one probably wouldn't use an accumulator to begin with.
honk
I am accepting this answer since it points out the relevant statement in the docs. I still have no idea whether there is some nasty templating trick to do what I wanted, but from the docs it seems that would need a dirty hack rather than something directly supported by `accumulator_set`.
honk