views: 393

answers: 7

I asked a similar question before but I'm still confused... can you generally make any assumptions about the minimum size of a data type?

What I have read so far

char: 1 byte
short: 2 bytes
int: 2 bytes, typically 4 bytes
long: 4 bytes

float: ???
double: ???

Are the values in float.h and limits.h system dependent?

Thx again, Oliver

+7  A: 

Yes, the values in float.h and limits.h are system dependent. You should never make assumptions about the width of a type, but the standard does lay down some minimums. See §6.2.5 and §5.2.4.2.1 in the C99 standard.

For example, the standard only says that a char should be large enough to hold every character in the execution character set. It doesn't say how wide it is.

For the floating-point types, the standard only hints at their relative widths:

§6.2.5.10

There are three real floating types, designated as float, double, and long double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double.

This implicitly defines the ordering of their widths, but not how wide each one actually is. "Subset" itself is vague, because a long double can have exactly the same range as a double and still satisfy this clause.

This is pretty typical of how C goes, and a lot is left to each individual environment. You can't assume, you have to ask the compiler.
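
For example, you can ask the implementation directly. A minimal sketch using only the standard <limits.h> and <float.h> macros (the output will of course vary by platform):

#include <stdio.h>
#include <limits.h>
#include <float.h>

int main(void)
{
    /* Print the widths and ranges this particular implementation chose. */
    printf("CHAR_BIT = %d\n", CHAR_BIT);
    printf("SHRT_MAX = %d\n", SHRT_MAX);
    printf("INT_MAX  = %d\n", INT_MAX);
    printf("LONG_MAX = %ld\n", LONG_MAX);
    printf("sizeof(float)  = %zu, FLT_DIG = %d\n", sizeof(float), FLT_DIG);
    printf("sizeof(double) = %zu, DBL_DIG = %d\n", sizeof(double), DBL_DIG);
    return 0;
}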

Jed Smith
A char is required to be one byte. But a byte is not required to be 8 bits.
Chris Conway
+11  A: 

This is covered in the Wikipedia article:

A short int must not be larger than an int.

An int must not be larger than a long int.

A short int must be at least 16 bits long.

An int must be at least 16 bits long.

A long int must be at least 32 bits long.

A long long int must be at least 64 bits long.

The standard does not require that any of these sizes be necessarily different. It is perfectly valid, for example, if all four types are 64 bits long.
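
If you want to see those guarantees expressed in code, here is a sketch using the <limits.h> macros; on a conforming implementation these checks can never actually fire, they just restate the minimums from the list above (long long and LLONG_MAX require C99):

#include <limits.h>

/* Each check fails to compile if the implementation provides less range
   than the standard requires. */
#if SHRT_MAX < 32767
#error "short int must be at least 16 bits"
#endif
#if INT_MAX < 32767
#error "int must be at least 16 bits"
#endif
#if LONG_MAX < 2147483647
#error "long int must be at least 32 bits"
#endif
#if LLONG_MAX < 9223372036854775807
#error "long long int must be at least 64 bits"
#endif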

Suppressingfire
There's a common misconception that char = byte = 8-bits. As stated in this answer, a char could very well be some other size (like 64-bits). sizeof just measures "how many chars would fit" in a given datatype. (so sizeof(char) == 1 by definition)
Laurence Gonsalves
@Laurence: `char` isn't discussed in this answer at all?
Jed Smith
@Jed: it's true that I didn't mention it, but it's a good piece of information to include in the answer (maybe it would have better fit as a comment to the question, though).
Suppressingfire
`char` is defined as at least 8 bits (`<limits.h> CHAR_BIT` is required to be at least 8).
Pavel Minaev
Moreover, both POSIX and Windows require `CHAR_BIT` to be 8, so that means `CHAR_BIT` essentially **is 8** on ordinary desktop/workstation/server environments. The only exceptions are ancient legacy mainframes, DSPs, and possibly some embedded systems.
R..
+3  A: 

However, C99 specifies (in stdint.h) minimum-width types such as uint_least8_t and int_least32_t, along with optional exact-width types such as int32_t (see http://en.wikipedia.org/wiki/Stdint.h).
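
A small usage sketch (the variable names are made up; the types and the PRI* macros come from C99's <stdint.h> and <inttypes.h>):

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint_least8_t flags   = 0x7F;     /* at least 8 bits wide */
    int_least32_t counter = 100000L;  /* at least 32 bits wide */
    uint_fast16_t index   = 42;       /* fastest type with >= 16 bits */

    /* The PRI* macros expand to whatever conversion specifier matches
       the widths the implementation actually chose. */
    printf("flags   = %" PRIuLEAST8  "\n", flags);
    printf("counter = %" PRIdLEAST32 "\n", counter);
    printf("index   = %" PRIuFAST16  "\n", index);
    return 0;
}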

J S
If your implementation doesn't have cstdint, try Boost's cstdint, which places the correct types in the boost namespace. I believe it delegates to the implementation's cstdint if it has one.
KitsuneYMG
@kts, Boost is a C++ library, the question is tagged C.
mctylr
In the case of boost.cstdint it is both. cstdint is almost entirely typedefs.
KitsuneYMG
A: 

Most libraries define something like this:

#ifdef MY_ARCHITECTURE_1
/* These aliases are only correct for this architecture's native type sizes. */
typedef unsigned char  u_int8_t;   /* char is 8 bits here   */
typedef short          int16_t;    /* short is 16 bits here */
typedef unsigned short u_int16_t;
typedef int            int32_t;    /* int is 32 bits here   */
typedef unsigned int   u_int32_t;
typedef unsigned char  u_char;
typedef unsigned int   u_int;
typedef unsigned long  u_long;
typedef unsigned short u_short;
#endif

You can then use those typedefs in your programs instead of the standard types.
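
If you go that route, it is worth verifying the guesses. A minimal sanity check, assuming the MY_ARCHITECTURE_1 branch above is the one in effect (it uses the negative-array-size trick, so a wrong typedef fails to compile):

/* Each array size becomes -1, and compilation fails, if the typedef
   does not have the width its name promises. */
typedef char assert_u_int8_is_1_char[sizeof(u_int8_t) == 1 ? 1 : -1];
typedef char assert_int16_is_2_chars[sizeof(int16_t)  == 2 ? 1 : -1];
typedef char assert_int32_is_4_chars[sizeof(int32_t)  == 4 ? 1 : -1];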

Pierre
+1  A: 

Quoting the standard does give what is defined to be "the correct answer" but it doesn't actually reflect the way programs are generally written.

People make assumptions all the time that char is 8 bits, short is 16, int is 32, long is either 32 or 64, and long long is 64.

Those assumptions are not a great idea but you will not get fired for making them.

In theory, <stdint.h> can be used to specify fixed-bit-width types, but you have to scrounge one up for Microsoft. (See here for a MS stdint.h.) One of the problems here is that C++ technically only needs C89 compatibility to be a conforming implementation; even for plain C, C99 is not fully supported even in 2009.
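
One common workaround is a small fallback header. This is only a sketch, assuming an MSVC version that predates <stdint.h> (Visual Studio 2010, _MSC_VER 1600, was the first to ship it); __int64 is MSVC-specific:

#if defined(_MSC_VER) && _MSC_VER < 1600
/* No <stdint.h>: map the fixed-width names onto MSVC's native types. */
typedef signed char      int8_t;
typedef short            int16_t;
typedef int              int32_t;
typedef __int64          int64_t;
typedef unsigned char    uint8_t;
typedef unsigned short   uint16_t;
typedef unsigned int     uint32_t;
typedef unsigned __int64 uint64_t;
#else
#include <stdint.h>
#endif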

It's also not accurate to say there is no width specification for char. There is, the standard just avoids saying whether it is signed or not. Here is what C99 actually says:

  • number of bits for smallest object that is not a bit-field (byte)
    CHAR_BIT 8
  • minimum value for an object of type signed char
    SCHAR_MIN -127 // -(2^7 - 1)
  • maximum value for an object of type signed char
    SCHAR_MAX +127 // 2^7 - 1
  • maximum value for an object of type unsigned char
    UCHAR_MAX 255 // 2^8 - 1
DigitalRoss
Those are minimum requirements. There's nothing in the Standard that forbids an implementation from defining CHAR_BIT as `9`, or `16`, `18`, ...
pmg
+1  A: 

Often developers asking this kind of question are dealing with arranging a packed struct to match a defined memory layout (as for a message protocol). The assumption is that the language should directly support laying out 16-, 24-, 32-bit, etc. fields for the purpose, as in the sketch below.
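
For instance, a hypothetical wire-format header might be written like this (the struct and field names are illustrative, and the packing pragma is a compiler extension accepted by MSVC and GCC rather than something the standard guarantees):

#include <stdint.h>

#pragma pack(push, 1)          /* remove any padding between fields */
struct message_header {
    uint8_t  version;          /*  8-bit field */
    uint8_t  flags;            /*  8-bit field */
    uint16_t payload_len;      /* 16-bit field */
    uint32_t sequence;         /* 32-bit field */
};
#pragma pack(pop)

A 24-bit field, by contrast, has no direct counterpart and has to be split up or carried in a wider type.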

That is routine and acceptable for assembly languages and other application-specific languages closely tied to a particular CPU architecture, but is sometimes a problem in a general purpose language which might be targeted at who-knows-what kind of architecture.

In fact, the C language was not intended for a particular hardware implementation. It was specified generally so a C compiler implementer could properly adapt to the realities of a particular CPU. A Frankenstein hardware architecture consisting of 9 bit bytes, 54 bit words, and 72 bit memory addresses is easily—and unambiguously—mapped to C features. (char is 9 bits; short int, int, and long int are 54 bits.)

This generality is why the C specification says something to the effect of "don't expect much about the sizes of ints beyond sizeof (char) <= sizeof (short int) <= sizeof (int) <= sizeof (long int)." That implies that chars could be the same size as longs!

The current reality is, and the future seems to hold, that software demands architectures provide 8-bit bytes and memory words addressable as individual bytes. This wasn't always so. Not too long ago, I worked on the CDC Cyber architecture, which features 6-bit "bytes" and 60-bit words. A C implementation on that would be interesting. In fact, that architecture is responsible for the weird packing semantics of Pascal, if anyone remembers that.

wallyk
It is common on modern DSP architectures for the smallest addressable unit to be larger than 8 bits, e.g. 32 bits.
caf
+1  A: 

If you want to check that the size (in multiples of chars) of some type on your system/platform really is the size you expect, you could do:

/* Division by zero in a constant expression makes this fail to compile
   whenever sizeof(float) != 4. */
enum CHECK_FLOAT_IS_4_CHARS
{
   IF_THIS_FAILS_FLOAT_IS_NOT_4_CHARS = 1/(sizeof(float) == 4)
};
S.C. Madsen
Huh?! Normally an assert is used. E.g. assert(sizeof(float) == 4);
mctylr
'assert' is evaluated at run time, so the program has to run before the check happens. The enum above causes an error at compile time if sizeof(float) is not 4.
S.C. Madsen
struct { char check_float_is_4_chars[2*(sizeof(float)==4)-1]; };
R..