What is the relation between word length, character size, integer size, and byte in C++?

+3  A: 

Standard C++ doesn't have a datatype called word or byte. The rest are well defined as ranges. The base is a char, which has CHAR_BIT bits. The most commonly used value of CHAR_BIT is 8.
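A minimal sketch of how CHAR_BIT ties the other sizes together. The helper name bits_in is made up for this illustration and is not a standard facility; only CHAR_BIT (from <climits>) and sizeof come from the language.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Number of bits occupied by an object of type T on this implementation.
// bits_in is a hypothetical helper name used only for this example.
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}

// sizeof counts in units of char, so a char always has exactly CHAR_BIT bits.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(bits_in<char>() == CHAR_BIT, "one char occupies one byte");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
```

On a typical desktop compiler bits_in<int>() comes out as 32, but that is a property of the platform, not of the language.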

dirkgently
char is 1 byte by standard
Artyom
@Artyom: I specified what a byte means (since it is not required to be always 8 bits -- a common misconception).
dirkgently
A: 

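Kind of depends on what you mean by relation. The sizes of the numeric types are generally related to the machine word size, typically a multiple or a simple fraction of it. A byte is 8 bits on virtually every current machine, but the standard only guarantees at least 8 (the exact number is CHAR_BIT). A char is defined in the standard as occupying exactly one byte; whether plain char is signed or unsigned is implementation-defined (check your ARM for details).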

The general rule is, don't make any assumptions about the actual size of data types. The standard specifies relationships between the types such as a "long" integer will be either the same size or larger than an "int". Individual implementations of the language will pick specific sizes for the types that are convenient for them. For example, a compiler for a 64-bit processor will choose different sizes than a compiler for a 32-bit processor.

You can use the sizeof() operator to examine the specific sizes for the compiler you are using (on the specific target architecture).
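As a rough sketch of that sizeof check, the function below collects the sizes into a string. The name size_report is hypothetical, and every number except the one for char depends on the compiler and target.

```cpp
#include <sstream>
#include <string>

// Build a report of sizeof for several built-in types, one per line.
// Only "char: 1" is guaranteed; the other values are whatever this
// compiler picks for this target. size_report is a hypothetical name.
std::string size_report() {
    std::ostringstream out;
    out << "char: "      << sizeof(char)      << '\n'
        << "wchar_t: "   << sizeof(wchar_t)   << '\n'
        << "short: "     << sizeof(short)     << '\n'
        << "int: "       << sizeof(int)       << '\n'
        << "long: "      << sizeof(long)      << '\n'
        << "long long: " << sizeof(long long) << '\n'
        << "void*: "     << sizeof(void*)     << '\n';
    return out.str();
}
```

Streaming size_report() to std::cout from main and comparing the output of a 32-bit and a 64-bit build is a quick way to see which types actually change.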

Jeff Kotula
A: 

1 byte = 8 bits (typically)

1 word = 4 bytes

1 char = 1 byte

1 wchar_t = compiler specific (>= 1 byte)

1 int = compiler specific

eduffy
-1: word size is implementation-specific
Mr Fooz
+5  A: 

The standard requires that certain types have minimum sizes (short is at least 16 bits, int is at least 16 bits, etc.), and that some groups of types are ordered (sizeof(int) >= sizeof(short) >= sizeof(char)). But that is it.
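The ordering can be checked at compile time. This is only a sketch; sizes_are_ordered is a name invented for the example, and the checks should hold on any conforming implementation.

```cpp
// True when the integer types are ordered by size as described above.
// sizes_are_ordered is a hypothetical helper name for this illustration.
constexpr bool sizes_are_ordered() {
    return sizeof(char)  == 1
        && sizeof(short) >= sizeof(char)
        && sizeof(int)   >= sizeof(short)
        && sizeof(long)  >= sizeof(int);
}

// Fails to compile on an implementation that breaks the ordering.
static_assert(sizes_are_ordered(), "integer types must be ordered by size");
```

Note that every comparison is >=, so an implementation where all four types have the same size would still pass.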

dmckee
One minor edit should be made - int only needs to be at least 16 bits.
Michael Burr
Yep. Checked it, fixed it. Thanks.
dmckee
+5  A: 

In C++ a char must be large enough to hold any character in the implementation's basic character set.

int has the "natural size suggested by the architecture of the execution environment". Note that this means that an int does not need to be at least 32 bits in size. Implementations where int is 16 bits are common (think embedded or MS-DOS).

The following are taken from various parts of the C++98 and C99 standards:

  • long int has to be at least as large as int
  • int has to be at least as large as short
  • short has to be at least as large as char

Note that they could all be the same size.

Also (assuming a two's complement implementation):

  • long int has to be at least 32 bits
  • int has to be at least 16 bits
  • short has to be at least 16 bits
  • char has to be at least 8 bits
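A sketch of how the minimum widths in the list above could be verified with std::numeric_limits. The helper width_of is hypothetical; it adds the sign bit to digits, which for integer types counts only the non-sign bits.

```cpp
#include <limits>

// Width of integer type T in bits: value bits plus the sign bit, if any.
// width_of is a hypothetical helper name; std::numeric_limits is standard.
template <typename T>
constexpr int width_of() {
    return std::numeric_limits<T>::digits
         + (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// The minimum widths from the list above.
static_assert(width_of<signed char>() >= 8,  "char has at least 8 bits");
static_assert(width_of<short>()       >= 16, "short has at least 16 bits");
static_assert(width_of<int>()         >= 16, "int has at least 16 bits");
static_assert(width_of<long>()        >= 32, "long has at least 32 bits");
```

These are lower bounds only, so the assertions say nothing about whether int is 16, 32 or 64 bits on a given platform.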
Michael Burr
+4  A: 

The Standard doesn't know this "word" thingy used by processors. But it says the type "int" should have the natural size for an execution environment. But even for 64-bit environments, int is usually only 32 bits. So "word" in Standard terms has pretty much no common meaning (except for the common English "word" of course).

Character size is the size of a character, which depends on what character you are talking about. The character types are char, unsigned char and signed char. wchar_t is also used to store characters and can have any size (determined by the implementation, but it must use one of the integer types as its underlying type, much like enumerations), while char, signed char and unsigned char each occupy exactly one byte. That means that one byte has as many bits as one char has. If an implementation says one object of type char has 16 bits, then a byte has 16 bits too.
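To illustrate that every object is made of whole bytes, each the size of a char, here is a small sketch that copies an object's representation into unsigned chars. The name object_bytes is made up for the example; how the bytes of an int are ordered is platform-specific, so only their count is predictable.

```cpp
#include <cstring>   // std::memcpy
#include <vector>

// Copy the object representation of value into a vector of bytes.
// object_bytes is a hypothetical helper name used only for this example.
template <typename T>
std::vector<unsigned char> object_bytes(const T& value) {
    std::vector<unsigned char> bytes(sizeof(T));       // sizeof counts bytes
    std::memcpy(bytes.data(), &value, sizeof(T));      // byte-wise copy
    return bytes;
}
```

For example, object_bytes(0) yields sizeof(int) bytes, while object_bytes('x') yields exactly one.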

Now a byte is the size that one char occupies. It's a unit, not some specific type. There is not much more to it, just that it is the smallest unit in which you can address memory. I.e., you cannot take a pointer to a bit-field, but you can point to objects that start on a byte boundary.

"Integer size" is a pretty broad term. What do you mean? All of bool, char, short, int, long and the unsigned counterparts of the last four are integer types. Their range is what I would call "integer size", and it is documented in the C standard, which the C++ Standard adopts. For signed char the range is at least -127 to 127; short and int share the same minimum range, -2^15+1 to 2^15-1; and for long it is at least -2^31+1 to 2^31-1. Their unsigned counterparts range from 0 up to at least 2^8-1, 2^16-1 and 2^32-1 respectively. Those are minimum ranges, however. That is, an int may not have a maximum value of 2^14 on any platform, because that is less than 2^15-1. It follows from those values that a minimum number of bits is required: 8 for char, 16 for short and int, and 32 for long. Two's-complement representation for negative numbers is not required, which is why the minimum for signed char, for example, is -127 rather than -128.

Johannes Schaub - litb
+1  A: 

sizeof( char ) == 1 ( one byte ) (in both C++ and C)
sizeof( int ) >= sizeof( char )
word - not a C++ type; in computer architecture it usually means 2 bytes

bb