tags:

views:

445

answers:

7

Is the size of a datatype hardware-architecture dependent or compiler dependent?

I want to know which factors actually determine the size of a datatype.

+1  A: 

The size is ultimately determined by the compiler. Java, for example, has a fixed set of sizes (8, 16, 32, 64 bits), while the set of sizes C offers for its various types depends in part on the hardware it runs on. That is, the compiler makes the choice, but (except in cases like Java, where datatypes are explicitly independent of the underlying hardware) it is strongly influenced by what the hardware offers.

Carl Smotricz
This is not entirely correct: the C standard library provides fixed-size types like `int8_t`/`int16_t`/`int32_t` etc. These are guaranteed to be of the specific size regardless of the compiler or flags in use.
Shirkrin
`int8_t`, `uint32_t`, etc are only guaranteed to exist on systems where it makes sense for them to exist (think computers with 9-bit bytes). C99 requires "least" types of `int` and `unsigned` from 8 to 64 bits in steps of 8 bits, for example: `int_least8_t` or `uint_least64_t`.
pmg
Good grief, was there even a C compiler for those Symbolics monstrosities?
Crashworks
Sperry/Unisys systems have 9 bit bits and 36 bit ints, Crays had 64 bit everything: sizeof(char) == sizeof(long) == 1. DECsystem-10 had 36 bit registers.
janm
That is of course 9 bit bytes. 9 bit bits would be impressive.
janm
9 bit bytes ??? ... you mean 1byte == 9 bits ??
codingfreak
Yes, on some machines 1 byte is 9 bits. Range 0 to 511.
janm
@ pmg: Sorry, not quite correct. While `intN_t` in general is optional, "if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, it shall define the corresponding typedef names." (ISO/IEC 9899:1999, 7.18.1.1 (3), "Exact-width integer types")
DevSolar
@pmg: And `int_leastN_t` is not required "from 8 to 64 bits in steps of 8 bits", but in the distinct widths 8, 16, 32, and 64. (ditto, 7.18.1.2 (3), "Minimum-width integer types")
DevSolar
@DevSolar: I was trying to save characters with the "steps of 8 bits" and it turned out to be wrong. Thanks for noticing and alerting me. And about the `intN_t`: I meant what you quote from the Standard.
pmg
Ah, I see now how you meant the intN_t stuff. Indeed, on a 9-bit machine the implementation would *not* be required to provide e.g. int8_t. I missed that quirk the first time around.
DevSolar
A: 

The size of the different data types depends on the compiler and its configuration (different compilers, or different switches to the same compiler, on the same machine can produce different sizes).

Usually the compiler is matched to the hardware it is installed on, so you could say that the sizes of the types are also hardware dependent. Making a compiler that emits 16-bit pointers on a machine where they are 48 bits would be counterproductive.
But it is possible to use a compiler on one computer to create a program meant to run on a different computer with different sizes.

pmg
So you mean to say is .... it is not entirely compiler dependent or architecture dependent ??
codingfreak
It is exclusively compiler dependent.
pmg
An arguable exception is `int`, which the standard defines to have "the natural size suggested by the architecture of the execution environment". But this is hand-waving enough that I doubt anyone would actually declare that a compiler deliberately emulating 64bit on a 32bit machine is non-conforming. Or for that matter that LP64 is non-conforming just because 32bit `int` is not the "natural size" on a 64bit machine. So in practice "natural" is outside the scope of the C standard, and the requirement is ignored.
Steve Jessop
A: 

It depends on the target hardware architecture, operating system, and possibly the compiler.

The Intel compiler sizes a `long` integer as follows:

OS            arch         size
Windows       IA-32        4 bytes
Windows       Intel 64     4 bytes
Windows       IA-64        4 bytes
Linux         IA-32        4 bytes
Linux         Intel 64     8 bytes
Linux         IA-64        8 bytes
Mac OS X      IA-32        4 bytes
Mac OS X      Intel 64     8 bytes

Here is a link showing the sizes for the Microsoft Visual C++ compiler.

Joakim Karlsson
you mean Intel C compiler ??
codingfreak
The Intel C++ compiler: http://software.intel.com/en-us/articles/size-of-long-integer-type-on-different-architecture-and-os/
Joakim Karlsson
A: 

The answer to your question is yes, and I'll explain.

Consider common storage types, i.e. `size_t`, `int64_t`, etc. These are decided (defined, in most cases) at compile time, depending on the architecture you are building for. Don't have them? Fear not, the compiler has figured out the underlying meaning of `int` and how `unsigned` affects it.

Once the standard C headers figure out the desired word length, everything just adjusts to your system.

Unless, of course, you happen to be cross compiling, then they are decided (or defined) by whatever architecture you specified.

In short, your compiler (well, mostly the preprocessor) is going to adjust types to fit the target architecture, be it the one you are using or the one you are cross compiling for.

That is, if I understand your question correctly. This is one of the few 'magic' abstractions that the language provides, and part of the reason why it's often called 'portable assembly'.

Tim Post
So it is not entirely compiler dependent or architecture dependent ?? ... But pmg answers that it is exclusively compiler dependent
codingfreak
@codingfreak: The compiler can be told to build for any architecture. By default, it builds for the one it, itself, was built for, which happens to be yours :)
Tim Post
A: 

The size of the "native" datatypes is up to the compiler. While this in turn is influenced by the hardware, I wouldn't start guessing.

Have a look at `<stdint.h>` - that header has platform-independent typedefs that should cater for whatever needs you might have.

DevSolar
A: 

> It is exclusively compiler dependent.

Or, to put it more correctly: C language standard dependent (http://en.wikipedia.org/wiki/C99).

The standard does assign sizes to the built-in types, but they are not fixed as "one size for all".
It only specifies a minimum size (e.g. `char` is at least 8 bits) that every compiler on every architecture must preserve.
A `char` may also be 16 or even 32 bits, depending on the architecture.
The relative sizes between the different types are preserved as well.
That means, for example, that `short` can be 16 or 32 bits, but it can never be wider than the wider type `int` on the same architecture.
Only narrower or the same width.

Those are the borders within which compiler developers must work if they want to make a standard-compatible C compiler.

so even the C99 standard influences the range of the datatypes ...
codingfreak
+1  A: 

The compiler (more properly the "implementation") is free to choose the sizes, subject to the limits in the C standard (for instance int must be at least 16 bits). The compiler optionally can subject itself to other standards, like POSIX, which can add more constraints. For example I think POSIX says all data pointers are the same size, whereas the C standard is perfectly happy for sizeof(int*) != sizeof(char*).

In practice, the compiler-writer's decisions are strongly influenced by the architecture, because unless there's a strong reason otherwise they want the implementation to be efficient and interoperable. Processor manufacturers or OS vendors often publish a thing called a "C ABI", which tells you (among other things), how big the types are and how they're stored in memory. Compilers are never obliged to follow the standard ABI for their architecture, and CPUs often have more than one common ABI anyway, but to call directly from code out of one compiler to code out of another, both compilers have to be using the same ABI. So if your C compiler doesn't use the Windows ABI on Windows, then you'd need extra wrappers to call into Windows dlls. If your compiler supports multiple platforms, then it quite likely uses different ABIs on different platforms.

You often see abbreviations used to indicate which of several ABIs is in use. So for instance when a compiler on a 64 bit platform says it's LP64, that means long and pointers are 64bit, and by omission int is 32bit. If it says ILP64, that means int is 64bit too.

In the end, it's more a case of the compiler-writer choosing from a menu of sensible options, than picking numbers out of the air arbitrarily. But the implementation is always free to do whatever it likes. If you want to write a compiler for x86 which emulates a machine with 9-bit bytes and 3-byte words, then the C standard allows it. But as far as the OS is concerned you're on your own.

Steve Jessop