views:

140

answers:

2

My problem is more complex than this, so I've narrowed it down to a very simple example that would show me enough to know how to handle the rest.

Say I have an input iterator. I want make a new input iterator derived from it, where each element is the combination of multiple sequential elements of the original input with the following pattern. The run length is encoded in the input sequence.

Input: { 1 1 2 3 4 4 6 7 8 9 ... }

Output: { (1) (3+4) (6+7+8+9) ... }

I was thinking a function like this could process a single element and increment the input begin iterator (passed by reference). There are a few questions in my comments, plus I'd like to know if there's a good way to do it for the entire stream of elements.

EDIT: I'm aware there's a bug in the call to std::advance where the tmp iterator is incremented to be exactly end, which would be valid for this code. Let's focus on the rest of my questions and I'll fix that. Edit 2: should be fixed now?

template<class TInputIterator, class TOutputIterator>
void process_single(TInputIterator& begin, TInputIterator end, TOutputIterator destination)
{
    std::iterator_traits<TInputIterator>::value_type run_length = *begin;
    ++begin;

    // is there a better way to specify run_length elements to accumulate() without having to call advance() here?
    TInputIterator tmp(begin);
    std::advance(tmp, run_length);
    // Edited: this condition should work for the different kinds of iterators?
    if ((end < tmp) || (std::distance(begin, tmp) != run_length))
        throw std::range_error("The input sequence had too few elements.");

    // std::plus is the default accumulate function
    *destination = std::accumulate(begin, tmp, 0/*, std::plus<TInputIterator::value_type>()*/);

    // should I use std::swap(begin, tmp) here instead?
    begin = tmp;
}

Edit 3: In response to the answers, would this be better?

template<class TInputIterator, class TOutputIterator>
TInputIterator process_single(TInputIterator begin, TInputIterator end, TOutputIterator destination)
{
    typedef std::iterator_traits<TInputIterator>::value_type value_type;

    value_type run_length = *begin;
    ++begin;

    value_type sum = 0;
    while (run_length > 0 && begin != end)
    {
        sum += *begin;
        ++begin;
        --run_length;
    }

    if (run_length)
    {
        throw std::range_error("The input sequence had too few elements.");
    }

    *destination = sum;

    return begin;
}

template<class TInputIterator, class TOutputIterator>
void process(TInputIterator begin, TInputIterator end, TOutputIterator destination)
{
    while (begin != end)
    {
        begin = process_single(begin, end, destination);
    }
}
A: 

// is there a better way to specify run_length elements to accumulate() without having to call advance() here?

Not really.

// Edited: this condition should work for the different kinds of iterators?
if ((end < tmp) || (std::distance(begin, tmp) != run_length))
    throw std::range_error("The input sequence had too few elements.");

The problem here is the < operator, which is only going to work for RandomAccessIterators. Why not just:

if (std::distance(tmp, end) < run_length)

?

// should I use std::swap(begin, tmp) here instead?
begin = tmp;

Nope.

EDIT: I'm aware there's a bug in the call to std::advance where the tmp iterator is incremented to be exactly end, which would be valid for this code. Let's focus on the rest of my questions and I'll fix that.

Incrementing to end is standard behavior for STL algorithms.

void process_single(TInputIterator& begin, TInputIterator end, TOutputIterator destination)

STL iterators aren't generally a good type to pass byref. Callers all too often want to preserve them after the call to your function. For example, passing byRef causes this not to compile:

std::vector<something> t;
std::vector<something> t2;
process_single(t.begin(), t.end(), std::back_inserter(t2))

(Many compilers will take it but it's not standard)

Better would be to pass the iterator byval and then return the new position at which you end your algorithm, to be more consistent with the rest of the STL. For example, see std::find().

Hope that helps....

Billy ONeal
+2  A: 

I would write this algorithm manually.

Firstly, the function does not accept an input iterator, because those don't support advance and distance.

Secondly, the error checking is off. If I'm not mistaken, the possibility of end < tmp means some undefined behaviour has been invoked. Imagine the container is a std::list. What would happen if you managed to advance beyong list.end()? But I think it would be undefined even with a vector or array (and MSVC++ would probably kick in with its iterator debugging before you).

So, to decode the whole sequence, I'd do something like this:

#include <iostream>
#include <algorithm>
#include <vector>
#include <stdexcept>
#include <iterator>

template <class InputIterator, class OutputIterator>
void decode(InputIterator start, InputIterator end, OutputIterator output)
{
    typedef typename std::iterator_traits<InputIterator>::value_type value_type;
    while (start != end)
    {
        value_type count = *start;
        ++start;
        value_type result = value_type();
        for (value_type i = value_type(); i != count; ++i, ++start) {
            if (start == end) {
                throw std::range_error("The input sequence had too few elements.");
            }
            result += *start;
        }
        *output = result;
        ++output;
    }
}

int main()
{
    try {
        std::vector<int> v;
        decode(std::istream_iterator<int>(std::cin), std::istream_iterator<int>(), std::back_inserter(v));
        std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, " "));
    }
    catch (const std::exception& e) {
        std::cout << e.what() << '\n';
    }
}
UncleBens
Nice answer. +1 But... end < tmp is okay. It works only for RandomAccessIterators though...
Billy ONeal
For some iterators like `istream_iterator` incrementing when at the end leaves you at the end. Others, like pointers, keep going indefinitely. Another way to handle the situation is advance `run_length-1` and make sure that's not the end, then go one more.
280Z28
This is entirely implementation-dependent. With VC++ incrementing beyond `std::istream_iterator<T>()` triggers an error with checked iterators, as well as advancing a vector iterator beyond end(). No algorithm should ever advance beyond the end iterator. Also, `it + run_length - 1` has no less potential being out of bounds than `it + run_length`.
UncleBens