The C standard states:

ISO/IEC 9899:1999, 6.2.5.15 (p. 49)

The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.

And indeed gcc defines it according to the target platform.
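
For reference, a minimal sketch of how one can check which choice a given implementation made. It relies only on CHAR_MIN from <limits.h>; the helper name plain_char_is_signed is made up for illustration.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if plain char is signed on this
   implementation, 0 if it is unsigned. CHAR_MIN is 0 when char is
   unsigned and equals SCHAR_MIN when char is signed. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

With gcc this typically returns 1 on x86 and 0 on ARM Linux, and the default can be overridden with -fsigned-char or -funsigned-char.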

My question is, why does the standard do that? I can see nothing that can come out of an ambiguous type definition except hideous and hard-to-spot bugs.

Moreover, in ANSI C (before C99) the only byte-sized type is char, so using char for math is sometimes inevitable. So saying "one should never use char for math" is not quite true. If that were the intent, a saner decision would have been to include three types: "char, ubyte, sbyte".

Is there a reason for that, or is it just some weird backwards-compatibility gotcha, meant to allow bad (but common) compilers to be declared standard-compliant?

A: 

I suppose (off the top of my head) that their thinking was along the following lines:

If you care about the sign of char (using it as a byte) you should explicitly choose signed or unsigned char.
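
A small sketch of what that looks like when a byte buffer is used for arithmetic. The helper name byte_sum is hypothetical; the point is only that reading through unsigned char gives the same result whether plain char is signed or unsigned.

```c
#include <stddef.h>

/* Hypothetical helper: sums the bytes of a buffer as values in
   0..UCHAR_MAX. Because every byte is read through an unsigned char
   pointer, the result does not depend on the signedness of plain char. */
static unsigned long byte_sum(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += p[i];
    return sum;
}
```

Had the loop read the buffer through plain char instead, a byte such as 0xFF would contribute 255 on some platforms and -1 on others.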

hasen j
Unsigned came later; signed came a lot later than that.
Jonathan Leffler
+4  A: 

Perhaps historically some implementations' "char" were signed and some were unsigned, and so to be compatible with both they couldn't define it as one or the other.

newacct
Correct. In the current world, where every processor is either x86, Power or SPARC, it's difficult to realise that in the 70s there were dozens of processors available with different architectures, from the elegant simplicity of 8-bit DECs to monster 36-bit Burroughs behemoths. Not even the size of a character was agreed on: Xerox machines worked with a 6-bit character set.
James Anderson
Why would the machine care about characters? Was there a CPU instruction to output characters? I know of no such thing in x86.
Elazar Leibovich
Yes, the reason was historical. And then, since we had unsigned char/signed char/plain char, for symmetry reasons we also have signed int/short, even though signed is redundant for the other integer types. Thus, in principle the intent is to have the signedness well defined, but that can't happen anymore for char: too much code would break.
Johannes Schaub - litb
+9  A: 

"Plain" char having unspecified signedness allows compilers to select whichever representation is more efficient for the target architecture: on some architectures, zero-extending a one-byte value to the size of "int" requires fewer operations (thus making plain char unsigned), while on others the instruction set makes sign-extending more natural, and plain char gets implemented as signed.

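To make the widening step concrete, here is a rough sketch of the three conversions to int. The helper names are invented for illustration, and the comments describe what typical hardware does, not anything the standard spells out.

```c
#include <limits.h>

/* Hypothetical helpers: widen a byte-sized value to int. */

/* Value-preserving; on typical hardware this is a sign extension. */
static int widen_signed(signed char c) { return c; }

/* Value-preserving; on typical hardware this is a zero extension. */
static int widen_unsigned(unsigned char c) { return c; }

/* Behaves like one of the two above, whichever the implementation
   picked for plain char. */
static int widen_plain(char c) { return c; }
```

For a byte with all bits set, widen_signed gives -1, widen_unsigned gives 255 on an 8-bit-byte machine, and widen_plain gives whichever of the two matches the platform's choice.
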
Matthew Wightman
You've got it - efficiency was important.
Jonathan Leffler
+1, my thoughts, too
ammoQ
Yup, whatever the hardware provides should be available directly to the language, with minimum sticky sugar on it.
le dorfier
Then why not repeat the same story for unsigned/signed short? It should extend to int as well.
Elazar Leibovich
+1  A: 

In the good old days when C was defined, the character world was 7-bit, so the sign bit could be used for other things (like EOF).

Peter Miehle