I realise you can just #define some integers, but why didn't C have a dedicated boolean data type before C99? It's such a common occurrence in programming and logic that I don't understand the absence of an explicit type and notation.
Because they did not put one in. Sorry if that sounds snippy, but basically it was not defined as such. Remember that most people just #define TRUE and FALSE themselves.

You may say bool IS standard - but obviously it WAS not standard before C99, which was finalised about ten years ago ;) They added it then, once it had become obvious it was a missing item. No one can foresee everything, including a missing data type in a programming language.
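For reference, a minimal sketch of the classic pre-C99 idiom next to what C99 added via <stdbool.h> (the TRUE/FALSE/BOOL names here are just the common convention, not mandated by any standard):

```c
#include <stdio.h>
#include <stdbool.h>   /* C99: defines bool, true and false on top of _Bool */

/* The common pre-C99 idiom: roll your own boolean. */
#define TRUE  1
#define FALSE 0
typedef int BOOL;      /* a convention, not part of any standard */

int main(void)
{
    BOOL old_style = TRUE;
    bool new_style = true;
    printf("%d %d\n", old_style, new_style);   /* prints: 1 1 */
    return 0;
}
```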
C is actually little more than a higher-level assembly language. Yes, it has control structures and whatnot, and it even has types, which assembler certainly doesn't need.

But the language was designed decades ago. Since every boolean result ultimately comes down to individual bits in the processor's status word, it was evidently sufficient to just use an integral data type for it. That probably also made the compiler a little less complex, since some type checking can be omitted (in later languages, control structures require a boolean value; in C they just require an integral value that is either 0 or something else), as the sketch below illustrates.
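A quick illustration of that last point, with made-up variable names: any integral expression works directly as a condition, zero meaning false and everything else meaning true:

```c
#include <stdio.h>

int main(void)
{
    int errors = 3;     /* any non-zero value counts as "true" */

    if (errors)
        printf("%d error(s) found\n", errors);

    while (errors)      /* loops until errors reaches 0, i.e. "false" */
        errors--;

    printf("done, errors = %d\n", errors);   /* prints: done, errors = 0 */
    return 0;
}
```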
I suspect it was deemed sufficient to have an integer type, with 0 being false and anything non-zero being true.
A CPU has no "boolean type"; it only works on bytes and multiples of them, so a boolean type made no sense at that time - it offered no advantage (why have a dedicated type when all you can check is "is 0" or "is not 0"?).
It was common (and still is in some cases) to treat zero as false and any non-zero value as true. This has advantages for shorthand: for example, instead of while (remaining != 0) you can just write while (remaining).
Some languages standardised on true being -1. The reason is that in two's-complement notation (which most computers use to represent negative numbers), the bitwise NOT of 0 is -1 (in 8-bit binary, 11111111 is decimal -1).
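You can check that identity directly in C; this little sketch assumes a two's-complement machine, which is virtually every machine in practice:

```c
#include <stdio.h>

int main(void)
{
    int t = ~0;               /* flip every bit of 0 */
    printf("%d\n", t);        /* prints -1 on a two's-complement machine */
    printf("%d\n", t == -1);  /* prints 1: comparisons yield exactly 0 or 1 */
    return 0;
}
```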
Over time it was realised that using a compiler-defined constant would prevent a lot of potential confusion. It's been a while since I've done C++, but I'm fairly sure any non-zero value still evaluates as "true".
The type you use to store a Boolean (usually) embodies a trade-off between space and time. You'll typically get the fastest results (at least for an individual operation) by using an int (typically four bytes). On the other hand, if you're using very many of them, it can make a lot more sense to use one byte each, or even to pack them so that each value you store occupies only a single bit - but when/if you do that, reading or writing a single bit becomes substantially more expensive.
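To illustrate the packed end of that trade-off, here is a minimal sketch (the macro names are invented for this example) that stores eight flags per byte at the cost of extra shifting and masking on every access:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers: one flag per bit in an array of unsigned char. */
#define BIT_SET(a, i)   ((a)[(i) / 8] |=  (1u << ((i) % 8)))
#define BIT_CLEAR(a, i) ((a)[(i) / 8] &= ~(1u << ((i) % 8)))
#define BIT_TEST(a, i)  (((a)[(i) / 8] >> ((i) % 8)) & 1u)

int main(void)
{
    unsigned char flags[4];            /* room for 32 booleans in 4 bytes */
    memset(flags, 0, sizeof flags);

    BIT_SET(flags, 3);
    BIT_SET(flags, 17);
    BIT_CLEAR(flags, 3);

    printf("%u %u\n", BIT_TEST(flags, 3), BIT_TEST(flags, 17));  /* prints: 0 1 */
    return 0;
}
```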
Since there was no one answer that was really "right", they left the decision to the user to make based on the requirements of the program they were writing.
The real question, then, is why a Boolean type was added in C99. My guess is that a couple of factors were involved. First, they realized that convenience for the programmer is now usually more important than giving the absolute best performance possible. Second, compilers now do quite a bit more global analysis, so it's at least conceivable that somebody might write a compiler that picks the representation most appropriate for a particular program (though I don't know of any that really does).
Historical reasons, probably:
CPL, which was heavily influenced by ALGOL, most likely had a boolean type, but my google-fu didn't suffice to find a reference for this. But CPL was too ambitious for its time, resulting in a stripped-down version called BCPL, which had the benefit that you could actually implement it on available hardware.
BCPL had only a single type - the 'word' - which in boolean contexts was interpreted as false if 0 and as true if ~0 (meaning the complement of 0, which would represent the value -1 if interpreted as a signed two's-complement integer). The interpretation of any other value was implementation-dependent.
After B, the still typeless successor to BCPL, C reintroduced a type system, but one still heavily influenced by the typeless nature of its predecessors.
If you spend a little time in the library, you don't have to speculate. Here are some statements taken from Dennis Ritchie's paper on the evolution of C, "The Development of the C Language". The context is that Dennis was building on Ken Thompson's language B, which was implemented on the very tiny PDP-7, a word-addressed machine. Because of growing interest, the group got one of the very first PDP-11s. Dennis writes,
The advent of the PDP-11 exposed several inadequacies of B's semantic model. First, its character-handling mechanisms, inherited with few changes from BCPL, were clumsy: using library procedures to spread packed strings into individual cells and then repack, or to access and replace individual characters, began to feel awkward, even silly, on a byte-oriented machine.
The B and BCPL model implied overhead in dealing with pointers: the language rules, by defining a pointer as an index in an array of words, forced pointers to be represented as word indices. Each pointer reference generated a run-time scale conversion from the pointer to the byte address expected by the hardware.
For all these reasons, *it seemed that a typing scheme was necessary to cope with characters and byte addressing*, and to prepare for the coming floating-point hardware. Other issues, particularly type safety and interface checking, did not seem as important then as they became later.
(Emphasis mine.)
The paper goes on to describe Dennis's struggles to invent a new pointer semantics, to make arrays work, and to come to terms with this newfangled struct idea. Notions of type safety and of distinguishing Booleans from integers did not seem important until much later :-)
Old C wasn't really "missing" a boolean type - it was just that all of the integral types were also considered suitable for doing double duty as booleans. I can see two main reasons for this:
Bit-addressing processors weren't at all common (and still aren't), so the compiler wouldn't really be able to use a "true boolean" type to save any space - the boolean would still have to be at least as big as a char anyway (if you hoped to access it efficiently).
Types narrower than int are widened to int in expressions anyway, so the boolean operators would still end up working on int operands (the sketch below demonstrates this promotion).
...so it just looks like there wasn't a compelling enough case that a dedicated boolean type would actually convey practical benefits.
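The promotion point is easy to see for yourself; in this sketch, sizeof shows that a char operand is widened to int before the operator ever touches it:

```c
#include <stdio.h>

int main(void)
{
    char c = 1;
    /* c is promoted to int before !, so the result is an int, not a char */
    printf("%zu %zu\n", sizeof c, sizeof !c);  /* typically prints: 1 4 */
    return 0;
}
```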
Remember that the C language does have a set of operators that produce boolean results (defined to be either 0 or 1) - !, &&, ||, !=, ==, <, <=, > and >= - so it's only a dedicated boolean type that's not there.