views:

178

answers:

5

How would you fix this code?

template <typename T> void closed_range(T begin, T end)
{
    for (T i = begin; i <= end; ++i) {
        // do something
    }
}
  • T is constrained to be an integer type, can be the wider of such types and can be signed or unsigned

  • begin can be numeric_limits<T>::min()

  • end can be numeric_limits<T>::max() (in which case ++i will overflow in the above code)

I've several ways, but none I really like.

+5  A: 

My take:

// Make sure we have at least one iteration
if (begin <= end)
{
    for (T i = begin; ; ++i)
    {
        // do something

        // Check at the end *before* incrementing so this won't
        // be affected by overflow
        if (i == end)
            break;
    }
}
R Samuel Klatchko
I think it looks a little cleaner as `assign; while(test) {increment; code}`
Jefromi
@Jefromi: assign what, though? You'd need `T i = begin - 1;` which still has overflow problems in the other direction.
Steve Jessop
That's arguable. In this case, `while` emphasizes the break condition, `for` emphasizes the iteration. Either method requires you search for the other half of the equation [you want your increment to be at the end of the loop, by the way, or you'll have changed the semantics of the loop].
Dennis Zickefoose
+6  A: 

Maybe,

template <typename T> void closed_range(T begin, const T end)
    if (begin <= end) {
        do {
            // do something
        } while (begin != end && (++begin, true));
    }
}

Curses, my first attempt was wrong, and the fix above isn't as pretty as I'd hoped. How about:

template <typename T> bool advance(T &value) { ++value; return true; }

template <typename T> void closed_range(T first, const T last)
    if (first <= last) {
        do {
            // do something
        } while (first != last && advance(first));
    }
}

There's no ambiguity with std::advance even if T isn't an integer type, since std::advance takes 2 parameters. So the template would also work with for instance a random-access iterator, if for some reason you wanted a closed range of those.

Or how about a bit of set theory? Obviously this is massive overkill if you're only writing one loop over a closed range, but if it's something that you want to do a lot, then it makes the loop code about right. Not sure about efficiency: in a really tight loop you might want make sure the call to endof is hoisted:

#include <limits>
#include <iostream>

template <typename T>
struct omega {
    T val;
    bool isInfinite;
    operator T() { return val; }
    explicit omega(const T &v) : val(v), isInfinite(false) { }
    omega &operator++() {
        (val == std::numeric_limits<T>::max()) ? isInfinite = true : ++val;
        return *this;
    }
};

template <typename T>
bool operator==(const omega<T> &lhs, const omega<T> &rhs) {
    if (lhs.isInfinite) return rhs.isInfinite;
    return (!rhs.isInfinite) && lhs.val == rhs.val;
}
template <typename T>
bool operator!=(const omega<T> &lhs, const omega<T> &rhs) {
    return !(lhs == rhs);
}

template <typename T>
omega<T> endof(T val) { 
    omega<T> e(val);
    return ++e;
}

template <typename T>
void closed_range(T first, T last) {
    for (omega<T> i(first); i != endof(last); ++i) {
        // do something
        std::cout << i << "\n";
    }
}

int main() {
    closed_range((short)32765, std::numeric_limits<short>::max());
    closed_range((unsigned short)65533, std::numeric_limits<unsigned short>::max());
    closed_range(1, 0);
}

Output:

32765
32766
32767
65533
65534
65535

Be a bit careful using other operators on omega<T> objects. I've only implemented the absolute minimum for the demonstration, and omega<T> implicitly converts to T, so you'll find that you can write expressions which potentially throw away the "infiniteness" of omega objects. You could fix that by declaring (not necessarily defining) a full set of arithmetic operators; or by throwing an exception in the conversion if isInfinite is true; or just don't worry about it on grounds that you can't accidentally convert the result back to an omega, because the constructor is explicit. But for example, omega<int>(2) < endof(2) is true, but omega<int>(INT_MAX) < endof(INT_MAX) is false.

Steve Jessop
+4  A: 

This works and is fairly clear:

T i = begin;
do {
   ...
}
while (i++ < end);

If you want to catch the special case of begin >= end you need to add another if like in Steve Jessop's solution.

sth
+1: good clean solution, and besides, there's a shortage of do-while loops in the world.
Jefromi
The problem with this is that in the case where `T` is `int` and `end` is `INT_MAX`, it increments `i` when `i`'s value is INT_MAX. That's undefined behaviour, even though the value of `i` is never used again.
Steve Jessop
In fairness I should add, that's quite a small problem compared with the problem that the questioner's code has, of being wrong on *all* implementations ;-)
Steve Jessop
+1: Simple and clean. Nice!
Void
@Steve: If i=INT_MAX, is the return value of i++ well defined in this case to be INT_MAX even if the value of "i" is undefined after the operation completes? I ask since I don't know. :)
Void
In general, once you've introduced undefined behavior, all bets are off. Your system could throw an exception on integral overflow, for instance.
Dennis Zickefoose
@Void: what Dennis says. It's not specifically the value of `i` that's undefined, it's the behaviour of the program. Realistically, the likely behaviours based on actual systems is either it harmlessly wraps around, or else a hardware trap occurs. The former is by far the most common, which is why it's only a small problem. It's just that the C++ standard doesn't actually require implementations to do anything sensible.
Steve Jessop
@Dennis, @Steve: I see what you're saying now. Good points.
Void
@Steve, the problem with undefined behaviors is that the optimizer will assume they don't happen, for instance deducing the legal range of values and optimizing before the place where the undefined behavior happen assuming it won't, getting non causal effects.
AProgrammer
@AProgrammer: true in general, but for any specific case like signed integer overflow, implementations can define it (to wrap around, for example), and if so they won't assume it doesn't happen. For example GCC has -fwrapv, and with that option the above code is safe and correct (if not necessarily efficient on rare platforms which don't use 2's complement and don't offer signed integer wraparound instructions)
Steve Jessop
@Steve, yes compilers can do what they want for undefined behavior, even define it. But of all my targets, gcc is the only one who allows to define the overflow behavior. I've lost enough time to track bugs introduced by unknowingly depending on a specific behavior for an UB to not want to add such a dependency knowingly.
AProgrammer
Only GCC has an option for it, but I suspect if we really looked into it we'd discover that MSVC defines wraparound always - either explicitly, or implicitly because it defines 2's complement signed integers, plus some other language to imply that signed and unsigned arithmetic use the same CPU ops. I do agree though, writing portable code is a discipline of its own, and requires you to avoid implementation-specific behaviour. Or at the very least, document it so that new potential targets can easily be checked to see if they support the software.
Steve Jessop
A: 

EDIT: Reworked things to more closely match the OP.

#include <iostream>
using namespace std;

template<typename T> void closed_range(T begin, T end)
{
    for( bool cont = (begin <= end); cont; )
    {
        // do something
        cout << begin << ", ";
        if( begin == end )
            cont = false;
        else
            ++begin;
    }

    // test - this should return the last element
    cout << " --  " << begin;
}
int main()
{
    closed_range(10, 20);
    return 0;
}

The output is:

10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, -- 20

John Dibling
+3  A: 
template <typename T, typename F>
void closed_range(T begin, T end, F functionToPerform) 
{
    for (T i = begin; i != end; ++i) { 
        functionToPerform(i);
    }
    functionToPerform(end);
} 
Bill
You can see it in action here: http://codepad.org/4zctR9w5 (Test harness borrowed from Kirill)
Bill
And here when `end` is `numeric_limits<int>::max()`: http://codepad.org/6qwwFAVd
Bill
Really, you don't need proof that this works. That style of half-open iteration is idiomatic in C++, so there's a lot less room for confusion than in the more complicated examples.
Dennis Zickefoose
@Dennis: Glad to hear it. It's mostly an excuse to play around with codepad, which I've only recently found. :)
Bill
This is good provided you don't need a way for your closed range to be empty. The more complicated examples allow that extra corner case.
Steve Jessop
@Steve: as written, closed_range accepts any forward_iterator, just like the standard algorithms. If you want to restrict the input to just random_access_iterator then you could test `if (begin > end)`. (Because the standard functions only take input_iterators, it will produce bad behavior if you call them with `begin > end`, anyway.) If the caller of closed_range knows they have a random_access_iterator they can do the check themselves, just as they would have to for any other algorithm.
Bill
@Matthieu: For this function, you cannot have an empty range. If you call it with `begin = 1; end = 2`, the function will be called for 1 and 2. If you call it with `begin = 1; end = 1`, the function will be called for 1. If you call it with `begin = 2; end = 1` then you'll get bad behavior, same with the standard library. (In this case, it will loop through the entire range of `int` before halting.)
Bill
True, and Matthieu's proposed fix is forward-iterator-friendly. I don't think "same with the standard library" is a good enough excuse, since one major benefit of half-open ranges is that they allow an empty range with no messing about. The reason that the standard library doesn't allow `begin > end` is because it's meaningless, with a closed range it's meaningless if (and only if) you explicitly ban the empty range (which was a closed set last time I took a topology course).
Steve Jessop
@Steve: Matthieu's proposed fix is forward-iterator friendly, but range-size-one unfriendly. Using only forward-iterators, there's no way to recognize both the empty range and a range of size one. I'd rather cover `1 to max` instead of `0, 2 to max`.
Bill
Sorry, yes, Matthieu's fix isn't really a fix. You're right, forward iterators can't define an empty closed range: you can't in general increment `end`, or detect whether it is safe to do so, so you can't detect that `begin` is "1 past `end`". Maybe the proper solution is to have a general template for forward iterators (which bans empty ranges) and specialisations for random-access iterators and integer types (which allow them). Btw, sorry for so many changes to this comment, I confused myself.
Steve Jessop
Yes... Sorry I am really in a mindset of half-open range :\ I don't think we should worry about forward iterators though. If we were using forward iterators, we would be able to use half-open ranges, the problem here is specific to numbers and the underflow / overflow issues that they face.
Matthieu M.
@Matthieu: All iterators face underflow/overflow issues, though. You could have a vector<T>::iterator at numeric_limits<size_t>::max(), after all.
Bill
It depends on how it is implemented, I typically thought that they were implemented in term of actual pointers and there is no risk of underflow / overflow as long as you stay within the boundaries of the `vector` itself (and one past last).
Matthieu M.
@Matthieu: On my 32-bit system a pointer is the same size as an integer. Any range problems that I worry about for `int` I should worry about for `int*`.
Bill
@Bill: I don't see how you could have a pointer overflow. Unless of course you are doing pointer arithmetic and are not talking of pointers to real objects... but I always considered pointer arithmetic to be a quite specific case and I stray for it as much as I can.
Matthieu M.
@Matthieu: If you can have an integer overflow, and the range of a pointer is the same as an integer, then you can have a pointer overflow. If your iterator can overflow when the underlying representation is an integer, then your iterator can overflow when the underlying representation is a pointer that is the same size as an integer.
Bill