views:

85

answers:

3

I'm self-learning how to create generic functions using iterators. As the Hello World step, I wrote a function to take the mean in the given range and returns the value:

// It is the iterator to access the data, T is the type of the data.
template <class It, class T> 
T mean( It begin, It end ) 
{
    if ( begin == end ) {
        throw domain_error("mean called with empty array");
    }

    T sum = 0;
    int count = 0;
    while ( begin != end ) {
        sum += *begin;
        ++begin;
        ++count;
    }
    return sum / count;
}

My first question is: Is using int for the counter OK, can it overflow if the data is too long?

I call my function from the following test harness:

template <class It, class T> T mean( It begin, It end );

int main() {
    vector<int> v_int;
    v_int.push_back(1);
    v_int.push_back(2);
    v_int.push_back(3);
    v_int.push_back(4);

    cout << "int mean    = " << mean( v_int.begin(), v_int.begin() ) << endl;;

    return 0;
}

When I compile this I get the error:

error: no matching function for call to ‘mean(__gnu_cxx::__normal_iterator<int*,    
std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*,
std::vector<int, std::allocator<int> > >)’

Thanks!

+2  A: 

My first question is: Is using int for the counter OK, can it overflow if the data is too long?

It's fine, unless you want to support a list of over 2 billion items.

When I compile this I get the error:

The template specialization is unable to deduce the return type T. You have to call it with all template parameters.

mean<vector<int>::const_iterator, double>( v_int.begin(), v_int.begin() )

BTW, in STL, the accumulate() and distance() functions can already compute the sum and count.

#include <numeric>
#include <iterator>

template <class It, class T>
T mean (It begin, It end) {
  if (begin == end)
    throw domain_error("mean called with empty array");

  T zero = 0;
  T sum = std::accumulate(begin, end, zero);
  typename iterator_traits<It>::difference_type count;
  count = std::distance(begin, end);
  return sum / count;
}
KennyTM
If `It` is not a random access iterator, there are two loops. To avoid it, use `for_each` as in my answer.
Alexandre C.
or more concisely: `return std::accumulate(begin,end,T())/std::distance(begin,end);
David Rodríguez - dribeas
+3  A: 
  1. You can use iterator_traits<It>::difference_type instead of int to be sure that it doesn't overflow. This is the type returned by std::distance.

  2. Your compilation error is because the compilator cannot determine the type T

This is because the compilator looks only at the function's declaration first. And if you look only at the declaration, you can't know what T is. As with the first question, you can use iterator_traits.

You can what you want like this:

template <class It> 
typename std::iterator_traits<It>::value_type mean( It begin, It end )
{
    if ( begin == end ) {
        throw domain_error("mean called with empty array");
    }

    typename std::iterator_traits<It>::value_type sum = 0;
    typename std::iterator_traits<It>::difference_type count = 0;
    while ( begin != end ) {
        sum += *begin;
        ++begin;
        ++count;
    }
    return sum / count;
}
Tomaka17
Except you need `typename` here and there...
jpalecek
Fixed, I usually forget these because MSVC++ is very (too much) tolerant
Tomaka17
Using `iterator_traits<It>::value_type` may not be the best option, I would prefer to leave it free (consider obtaining the mean of a vector of integers as a double)
David Rodríguez - dribeas
+2  A: 

This should be as a comment, but comments don't allow formatting code. And I think this should at least be pedagogical. Please refer to the documentation of for_each and unary_function.

I'd write it like this:

#include <algorithm>
#include <functional>
#include <iterator>

template <typename T>
struct accum : std::unary_function<T, void>
{
    void operator()(T const& x)
    { count++; acc += x; }

    // Edited to take care of case count == 0
    T mean() const
    {
        if (count) return acc / count; 
        else throw "Division by zero";
    }

private:
    size_t count;
    T acc;
};


template <template Iter>
typename std::iterator_traits<Iter>::value_type
mean(Iter begin, Iter end)
{
    typedef typename std::iterator_traits<Iter>::value_type T;
    return std::for_each(begin, end, accum<T>()).mean();
}

Usage: mean(v.begin(), v.end()).

It works because for_each returns the functor which has been applied to all the elements in sequence. In our case, the functor accumulated the results, and we can retrieve the mean.

Note that the functor accum can be enhanced to support more elaborated summation schemes, such as Kahan summation algorithm which reduces the roundoff error (there are many such algorithms, some of which totally eliminate it).

Alexandre C.
Excellent information, thanks a lot. I'm studying it to understand some of the more advanced concepts you've used, such as the unary_function. Also, your comment KennyTM's answer was useful, too.
recipriversexclusion