views:

774

answers:

6

I just spent an hour tracking an integer overflow in my code, so I decided to share that -most people on here will know that of course, but perhaps I can save someone some time. So, here's the question:

Should I use unsigned integers for my count class members?

Answer

For example, assume a class

TList <T> = class
private
  FCount : Cardinal;
public
  property Count : Cardinal read FCount;
end;

That does make sense, doesn't it? The number of items stored in a list can't be negative, so why not use an unsigned integer type for it? I think it's in general a good principle to always use the least general (ergo the most special) type possible.

Now, iterating over a list looks like this:

for I := 0 to List.Count - 1 do
  Writeln (List [I]);

When the number of items stored in the list is zero, the compiler tries to evaluate

List.Count - 1

which results in a nice Integer overflow (underflow to be exact). Combined with the fact that the debugger does not show the appropriate location where the exception occured, this was very hard to find for me.

Let me add that if you have overflow checking turned off, the resulting errors will be even harder to track, because then you will often access memory that doesn't belong to you - and that results in undefined behaviour.

I will be using plain Integers for all my count members from now on to avoid situations like this.

If that's complete nonsense, please point it out to me :)

+2  A: 

Moral: use iterators and foreach when you can, because it avoids this question altogether.

RichieHindle
That's probably true, but in some cases especially in often used container classes or in performance critical section I just don't want the overhead of an enumerator.
Smasher
Foreach? Perhaps you're thinking of a different language.
Rob Kennedy
@Richie: I believe you mean enumerators and for..in? @Smasher: You can decrease the "weight" of an enumerator dramatically by declaring it as a record instead of a class and marking the GetCurrent method as inline. ( http://hallvards.blogspot.com/2007/10/more-fun-with-enumerators.html )
Mason Wheeler
I'm afraid Rob was right - I was thinking of .net languages. But if there's an equivalent in Delphi, then that's what I meant too. 8-) (Richie makes a note to pay more attention in future)
RichieHindle
@Mason: I didn't know that, thanks for the hint!
Smasher
+2  A: 

Boundary conditions frequently present problems. Allowing for a type that can go negative may just shift the issue. Perhaps it shifts it in a way that's easier to debug, perhaps not. I started off using integers for counting loops like that, but later on switched to cardinals to help me catch errors.

Brian Knoblauch
Hmm...in my opinion the scenario I described is not really an error...everything works fine when using signed integers, so how does using signed integers help you to catch errors?
Smasher
If it's something that theoretically can't ever go negative, but it *is* negative, then you know you have a problem somewhere that needs to be solved!
Brian Knoblauch
+8  A: 

Regarding your statement that "I think it's in general a good principle to always use the least general (ergo the most special) type possible." - actually I think it's a good principle to use the data type that will cause you least angst and trouble.

Generally for me that's a signed int since:

  1. I don't usually have lists with 231 or more elements in them.
  2. You shouldn't have lists that big either :-)
  3. I don't like the hassle of having special edge cases in my code.

But it's really a style issue. If 'purity' of code is more important to you than brevity of code, your method is best (with modifications to catch the edge cases). Myself, I prefer brevity since edge cases tend to clutter the code and reduce understanding.

paxdiablo
+14  A: 

No, definitely not. Delphi idiom is to use integers here. Don't fight the language. In a 32 bit environment you'll not have more elements in the list, except if you try to build a bitmap.

Let's be clear: every programmer who is going to have to use your code is going to hate you for using a Cardinal instead of an integer.

Stephan Eggermont
+1 I didn't know about this idiom, so thanks for pointing that out
Smasher
btw: is there a specific reason for that "idiom"?
Smasher
Probably a Delphi guy doing exactly what you're trying to do and concluding: "Hey, this sucks, big time". :-)
paxdiablo
TList is part of the VCL, and has always had a count: integer. There was no explanation, but I guess Pax is right
Stephan Eggermont
For a long time, there was no unsigned 32-bit type in Delphi. If there could be more than 65535 items, you *needed* LongInt. There was no LongWord. I think even Cardinal was a signed type for a version or two.
Rob Kennedy
According to http://www.drbob42.com/delphi/porting.htm Cardinal was unsigned 31 bit in Delphi 2. 16 bits is Delphi 1, on Windows 3. Somewhere I have a CD...
Stephan Eggermont
There are several reasons for the idiom - 1. MaxListSize = Maxint div 16, so it can't ever NEED an unsigned integer. 2. IndexOf() type functions return -1 if an item isn't found, so they need to be integers. Lists can't be bigger than available memory!
Gerry
"every programmer who is going to have to use your code is going to hate you for using a Cardinal instead of an integer" /agreed
Ian Boyd
And for not using range checking to make your code safe. In which case it doesn't matter.
Marco van de Voort
+8  A: 

Unsigned integers are almost always more trouble than they're worth because you usually end up mixing signed and unsigned integers in expressions at some point. That means that the type will need to be widened (and probably have a performance hit) to get correct semantics (ideally the compiler does this as per language definition), or else you'll need to be very careful in your range checking.

Take C/C++ for example: size_t is the type of the integer for memory sizes and allocation, and is unsigned, but ptrdiff_t is the type for the offset you get when you subtract one pointer from another, and necessarily is signed. Want to know how many elements you've allocated in an array? Perhaps you subtract the first element address from the last+1 element address and divide by sizeof(element-type)? Well, now you've just mixed signed and unsigned integers.

Barry Kelly
+8  A: 

Don't.

It's not just going against a programming idiom, it's an explicit request to the compiler to use unsigned arithmetic, that is either prone to anomalous behaviors (if you don't guard against overflows) or to irrelevant runtime exceptions (if you guard against overflows, a temporary overflow will be fatal, f.i when you subtract before adding, even if the final result is positive, and I'm referring to the CPU opcode-level ordering of operations, which may not bear a trivial relationship to what you have in your code).

Keep in mind "unsigned" does not translate to "positive", it translates to "doesn't have a sign", which is different. The term "unsigned" was picked for good reason (and naming it "Cardinal" in Delphi was a poor choice IMO).

Unsigned types are for raw storage specifications, bitwise operations, ASM code, embedded controllers and other specialty uses. When you're doing high-level programming, you should forget you ever heard about unsigned types.

Eric Grange