That's all. I didn't find a similar topic, so bear with me if there is one.
No, it is not guaranteed to be 8-bits. sizeof(char) is guaranteed to be 1, but that does not necessarily mean one 8-bit byte.
No, the char data type must contain at least 8 bits (see the ANSI C specification).
From a copy of the ANSI C specification, see Section 3.1.2.5 - Types:
An object declared as type char is large enough to store any member of the basic execution character set. If a member of the required source character set enumerated in §2.2.1 is stored in a char object, its value is guaranteed to be positive. If other quantities are stored in a char object, the behavior is implementation-defined: the values are treated as either signed or nonnegative integers.
The concept of "execution character set" is introduced in Section 2.2.1 - Character sets.
In other words, a char has to be at least big enough to contain an encoding of at least the 95 different characters which make up the basic execution character set.
Now add to that the section 2.2.4.2 - Numerical limits
A conforming implementation shall document all the limits specified in this section, which shall be specified in the headers <limits.h> and <float.h>.
Sizes of integral types
The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
maximum number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
minimum value for an object of type signed char
SCHAR_MIN -127
maximum value for an object of type signed char
SCHAR_MAX +127
maximum value for an object of type unsigned char
UCHAR_MAX 255
...
So there you have it - the number of bits in a char must be at least 8.
The C99 standard draft says that a byte must be at least 8 bits wide, because <limits.h> contains a macro CHAR_BIT which yields the number of bits per byte, and is guaranteed to be at least 8 (§5.2.4.2.1).
The C++ standard draft includes C's <limits.h> under the name <climits> (§18.2.2).
From the C standard describing limits.h (some reformatting required):
- number of bits for smallest object that is not a bit-field (byte): CHAR_BIT 8
- minimum value for an object of type signed char: SCHAR_MIN -127
- maximum value for an object of type signed char: SCHAR_MAX +127
The CHAR_BIT minimum of 8 ensures that a character is at least 8 bits wide. The ranges on SCHAR_MIN and SCHAR_MAX ensure that the representation of a signed char uses at least eight bits.
The first thing I would say is that if you need a type to be an exact number of bits, then use a size-specific type. Depending on your platform, that could range from __s8 for a signed 8-bit type on Linux to __int8 in VC++ on Windows.
Now, in the chapter on portability in "Linux Kernel Development", Robert Love states that the C standard "leaves the size of the standard types up to implementations, although it does dictate a minimum size."
Then in a footnote at the bottom of the page he says, "With the exception of char which is always 8 bits".
Now I'm not sure what he's basing this on, but maybe it's this section from the ANSI C spec?
2.2.4.2 Numerical limits
A conforming implementation shall document all the limits specified in this section, which shall be specified in the headers limits.h and float.h
"Sizes of integral types limits.h"
The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
maximum number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
minimum value for an object of type signed char
SCHAR_MIN -127
maximum value for an object of type signed char
SCHAR_MAX +127
maximum value for an object of type unsigned char
UCHAR_MAX 255
minimum value for an object of type char
CHAR_MIN see below
maximum value for an object of type char
CHAR_MAX see below
maximum number of bytes in a multibyte character, for any supported locale
MB_LEN_MAX 1
minimum value for an object of type short int
SHRT_MIN -32767
maximum value for an object of type short int
SHRT_MAX +32767
maximum value for an object of type unsigned short int
USHRT_MAX 65535
minimum value for an object of type int
INT_MIN -32767
maximum value for an object of type int
INT_MAX +32767
maximum value for an object of type unsigned int
UINT_MAX 65535
minimum value for an object of type long int
LONG_MIN -2147483647
maximum value for an object of type long int
LONG_MAX +2147483647
maximum value for an object of type unsigned long int
ULONG_MAX 4294967295
If the value of an object of type char sign-extends when used in an expression, the value of CHAR_MIN shall be the same as that of SCHAR_MIN and the value of CHAR_MAX shall be the same as that of SCHAR_MAX. If the value of an object of type char does not sign-extend when used in an expression, the value of CHAR_MIN shall be 0 and the value of CHAR_MAX shall be the same as that of UCHAR_MAX.