views: 130
answers: 2
EDIT Public health warning - this question includes a false assumption about undefined behaviour. See accepted answer.

After reading a recent blog post, I've been thinking a lot about the practicality of avoiding all standards-undefined assumptions in C and C++ code. Here is a snippet of C++ code to do an unsigned 128-bit addition...

void c_UInt64_Pair::operator+= (const c_UInt64_Pair &p)
{
  m_Low  += p.m_Low;
  m_High += p.m_High;

  if (m_Low < p.m_Low)  m_High++;  // if m_Low wrapped, carry into the high word
}

This clearly relies on assumptions about overflow behaviour. Obviously most machines can support a binary integer of the right kind (though perhaps building from 32-bit chunks or whatever), but there's apparently a growing chance that the optimiser may exploit the standards-undefined behaviour here. That is, the only way that the m_Low < p.m_Low condition can pass is if m_Low += p.m_Low overflows, which is undefined behaviour, so the optimiser can legally decide that the condition always fails. In which case, this code is simply broken.

The question is, therefore...

How can you write a reasonably efficient version of the above without relying on undefined behaviour?

Assume that you have an appropriate 64-bit binary machine integer, but that you have a malicious compiler that will always interpret your undefined behaviour in the worst possible (or impossible) way. Also, assume that you don't have some special built-in, intrinsic, library or whatever to do it for you.

EDIT minor clarification - this isn't just about detecting overflow, but also ensuring that both m_Low and m_High end up with the correct modulo 2^64 results, which is also standards-undefined.

A: 

If you want an actually efficient method, you'll have to code in something other than C or C++. For a reasonably efficient one, you have to ensure that overflow never happens, and detect and compensate for the cases where it would have.

Basically, for each 64-bit component, you need to calculate the addition of the low 63 bits and of the top bits separately. From those partial sums you can work out what the 64-bit total would have been, and whether there was a carry.

Then when you do the upper 64-bit add, you add in the carry, if there is one. If a carry results from that, then you've overflowed your 128-bit variable, and you'll need to trigger an exception, or otherwise handle the case.
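
A rough sketch of that split, assuming the members are plain std::uint64_t (the original class definition isn't shown, so the struct here is just a guess at the same shape), might look like this:

#include <cstdint>

// Hypothetical stand-in for the questioner's class; std::uint64_t members
// are an assumption.
struct c_UInt64_Pair
{
  std::uint64_t m_Low, m_High;

  void operator+= (const c_UInt64_Pair &p)
  {
    const std::uint64_t LOW63 = 0x7FFFFFFFFFFFFFFFULL;

    // Low word: add the low 63 bits first - the sum of two 63-bit values
    // always fits in 64 bits, so nothing can wrap here.
    std::uint64_t low_sum  = (m_Low & LOW63) + (p.m_Low & LOW63);

    // Combine the two original top bits with the carry out of the 63-bit
    // sum. Bit 0 of top_bits is bit 63 of the result; bit 1 is the carry
    // into the high word.
    std::uint64_t top_bits = (m_Low >> 63) + (p.m_Low >> 63) + (low_sum >> 63);
    std::uint64_t carry    = top_bits >> 1;

    m_Low = (low_sum & LOW63) | ((top_bits & 1) << 63);

    // High word: same trick, folding in the carry from the low word.
    std::uint64_t high_sum = (m_High & LOW63) + (p.m_High & LOW63) + carry;
    std::uint64_t high_top = (m_High >> 63) + (p.m_High >> 63) + (high_sum >> 63);

    m_High = (high_sum & LOW63) | ((high_top & 1) << 63);
    // (high_top >> 1) is the carry out of the full 128-bit value, i.e. the
    // overflow case described above that would need separate handling.
  }
};

Every intermediate sum above stays within the 64-bit range, so nothing relies on wraparound; the price is several extra masks and shifts per addition.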

swestrup
Nope; what you say is true for signed integral types, but not unsigned.
David Thornley
Aannd... why would you not be able to efficiently do this in C or C++? What alternative would you suggest would be *more* efficient?
Billy ONeal
Most efficient would be to actually use the hardware facilities provided for just this purpose, be that 128-bit registers or carry flags.
swestrup
But I do stand corrected on the whole signed/unsigned thing. For some reason I seemed to recall that all arithmetic overflows were undefined.
swestrup
@swestrup - I was fooled too :-(
Steve314
@David - that sounds true pedantically (for signed integral types) but not practically (in that you can use unsigned integral types to do signed arithmetic). However, that may well break for anything but addition/subtraction.
Steve314
+7  A: 

From the C++ 1998 standard, 3.9.1(4): "Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer." Note that "integer", here, refers to any integer type rather than just int.

Therefore, assuming those are unsigned integers, as the "UInt64" in the type name suggests, this is defined behavior in C++ and should work as expected.
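
For what it's worth, here is a minimal self-contained check of the original operator+= (with assumed std::uint64_t members, since the class definition isn't shown) exercising the wrap-and-carry case:

#include <cassert>
#include <cstdint>

// Hypothetical reconstruction of the questioner's class with the original
// operator+=; the member types are assumed.
struct c_UInt64_Pair
{
  std::uint64_t m_Low, m_High;

  void operator+= (const c_UInt64_Pair &p)
  {
    m_Low  += p.m_Low;              // wraps modulo 2^64 (defined for unsigned)
    m_High += p.m_High;
    if (m_Low < p.m_Low) m_High++;  // a wrapped result is smaller, so carry
  }
};

int main()
{
  c_UInt64_Pair a{ 0xFFFFFFFFFFFFFFFFULL, 0 };  // 2^64 - 1
  c_UInt64_Pair b{ 1, 0 };
  a += b;                                       // should carry into the high word
  assert(a.m_Low == 0 && a.m_High == 1);
}

The low word wraps to zero, the comparison detects the wrap, and the carry propagates into the high word - exactly the modulo-2^64 behavior the standard guarantees.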

David Thornley
What is the C++ 1995 standard? I am aware of '99, '03... never heard of that one. In any case, while that is true, that does not mean that overflow behavior is defined, see the section I quoted above -- "If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined."
Billy ONeal
@Billy, since unsigned types follow rules of modulo arithmetic, there will never be a result that's outside the range of representable values, so the behavior is defined. The footnote on 3.9.1/4 says as much.
Rob Kennedy
@Rob: Ah.. the footnote does clarify that. I would contend though that the passage he cited does not make that inherently obvious.
Billy ONeal
@Billy ONeal: Let's assume that the cite is correct for C++. In any case the same modulo rule applies to C99. Arithmetic modulo 2^n is always well defined, there is no ambiguity about the results: they are always in the range `0 <= x < 2^n`. So the extra clause about mathematically undefined behavior can never apply to unsigned types. Unsigned integer types wrap, but they don't overflow.
Jens Gustedt
@Billy: Changed to 1998. As far as I know, there never was a 1999 standard, that's C. I don't have the 2003 version handy, and have no plans to get one. Further, specifying that unsigned obeys the laws of modular arithmetic is defining the behavior, and overflows are well defined in modular arithmetic.
David Thornley
You might also cite the C standard.
R..