sizeof is a C keyword. It yields a size whose type is size_t. However, size_t is not a keyword; it is defined in stddef.h and in several other C standard headers.

Consider a scenario where you want to write a C program that does not include any C standard headers or libraries (for example, an OS kernel). In such code sizeof can be used (it is a C keyword, so it is part of the language), but the type of its result (size_t) is not available!

Does not this signify some kind of a problem in the C standard specification? Can you clarify this?

+2  A: 

size_t is actually a type, often an unsigned int. sizeof is an operator that gives the size of a type or object. The underlying type of its result is implementation-defined rather than fixed by the C standard; it is simply an unsigned integer type.

Edit: To be very clear, you do not need the size_t type in order to use sizeof. I think the answer you're looking for is - Yes, it is inconsistent. However, it doesn't matter. You can still practically use sizeof correctly without having a size_t definition from a header file.

Tony k
Anthony: I'm aware of that. But, size_t is defined in C headers, it's not a part of the language. That's where the disconnect is.
Ashwin
In the C++ standard at least (I don't have the C standard), sizeof is defined as returning size_t, not just an integer type, and the clause explicitly references stddef.h.
anon
Ash: The following code is valid: int main() { double a; unsigned int b = sizeof(a); } Without any headers, this is still correct. The lowest common denominator of size_t is that it is an unsigned integral type. size_t is basically just a typedef.
Tony k
Anthony Kanago: This might also imply that the compiler you're using is aware of size_t. If we get pedantic, this may not be a valid case since the standard doesn't say the compiler should know about size_t.
Ashwin
+2  A: 

There is no reason not to include stddef.h, even if you are working on a kernel - it defines type sizes for your specific compiler that any code will need.

Note also that almost all C compilers are self-compiled. The actual compiler code for the sizeof operator will therefore use size_t and reference the same stddef.h file as does user code.

anon
Neil: But just for argument sake, let's say I want to create a program that doesn't use any C standard headers or libraries. Just a pure C language program.
Ashwin
The header files are _part_ of the C language!
anon
Neil Butterworth: I couldn't grok it in your reply, but understood it when I read Volte's reply :-) Now to complicate matters a bit, what will happen with a cross compiler whose native and destination platforms require different size_t? ;-)
Ashwin
that of course is why I said "almost all" :-)
anon
+1  A: 

size_t is not a keyword by necessity. Different architectures often have different sizes for integral types. For example, a 64-bit machine is likely to use unsigned long long as size_t if its designers did not decide to make int a 64-bit data type.

If you make size_t a built-in type of the compiler, you take away the power to do cross compilation.

Also, sizeof is more like a magic compile-time macro (think C++ template), which explains why it is a keyword instead of a defined type.

Unknown
This argument is moot, since even the size of integral types (for example) changes depending on the platform. An int could be 4 bytes or 8 bytes and so on, but int is still a keyword.
Ashwin
+2  A: 

From MSDN:

When the sizeof operator is applied to an object of type char, it yields 1

Even if you don't have stddef.h available/included and don't know about size_t, using sizeof you can get the size of objects relative to char.
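To make that concrete, here is a minimal sketch that includes no headers at all and measures everything in units of char. The function names are made up for illustration, and unsigned long is only assumed to be wide enough for these small values.

```c
/* No #include lines on purpose: sizeof needs no header. */

/* sizeof(char) is 1 by definition, so any other size is a count of chars. */
unsigned long chars_per_int(void)
{
    return sizeof(int) / sizeof(char);
}

/* The classic element-count idiom also works without size_t by name. */
unsigned long element_count(void)
{
    short table[12];
    return sizeof table / sizeof table[0];   /* 12 elements */
}
```

Storing the result in unsigned long is a choice made here for the example; portable code that handles arbitrary object sizes would still want size_t.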

schnaader
But if you want to store that size somewhere, without risking overflow, you will need the typedef of size_t.
anon
schnaader: All that says is that size_t will be some sort of an integral type, but not exactly what type.
Ashwin
@Neil - Honestly, how often do you find a type for which sizeof() returns something that would overflow even an unsigned char? On my system, even a FILE struct is only 88 bytes.
Chris Lutz
+12  A: 

sizeof is a keyword because, despite its name and usage, it is an operator like + or = or < rather than a function like printf() or atoi() or fgets(). A lot of people forget (or just don't know) that sizeof is actually an operator, and is always resolved at compile-time rather than at runtime.
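A small sketch of what being an operator means in practice, at least for ordinary operands that are not variable length arrays. The names INT_BYTES and operand_is_not_evaluated are hypothetical, chosen only for this example.

```c
#include <stddef.h>  /* size_t */

/* The result is an integer constant expression, so it can appear where a
   function call could not, for example as an enum constant. */
enum { INT_BYTES = sizeof(int) };

/* The operand of sizeof is not evaluated: only its type matters,
   so the increment below never happens. */
int operand_is_not_evaluated(void)
{
    int i = 5;
    size_t n = sizeof(i++);   /* same as sizeof(int) */
    (void)n;                  /* silence unused-variable warnings */
    return i;                 /* still 5 */
}
```

Some compilers warn about the side effect inside sizeof, which is a fair hint that writing it on purpose is rarely a good idea.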

The C language doesn't need size_t to be a usable, consistent language. That's just part of the standard library. The C language needs all operators. If, instead of +, C used the keyword plus to add numbers, you would make it an operator.

Besides, I do semi-implicit recasting of size_ts to unsigned ints (and regular ints, but Kernighan and Ritchie will someday smite me for this) all the time. You can assign the return type of a sizeof to an int if you like, but in my work I'm usually just passing it straight on to a malloc() or something.

Chris Lutz
Chris: Thanks for that cogent explanation. But doesn't that imply that size_t is an intimate part of C, like int and char are? Why define it in a header, then?
Ashwin
Not that it's widely used (or even supported), but for C99-style variable length arrays, sizeof is evaluated at runtime.
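A minimal sketch of that case, assuming a compiler with variable length array support (required in C99, optional from C11 on). The function name vla_bytes is made up for illustration.

```c
#include <stddef.h>  /* size_t */

/* n is only known at run time, so the compiler cannot fold sizeof buf
   into a constant. n must be greater than zero. */
size_t vla_bytes(size_t n)
{
    int buf[n];          /* variable length array */
    return sizeof buf;   /* computed at run time as n * sizeof(int) */
}
```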
Michael Burr
@Michael - Seriously? Damn. However, C99 is really poorly supported, so I'm sticking to my malloc() and realloc() and free() to do that job for a while.
Chris Lutz
Michael Burr: That is news to me! Thanks for pointing it out :-)
Ashwin
+1  A: 

The simple reason is that it is not a fundamental type. If you look up the C standard you will find that the fundamental types include int, char, etc., but not size_t. Why so? As others have already pointed out, size_t is an implementation-specific type (i.e. a type capable of holding the size, in number of "C bytes", of any object).

On the other hand, sizeof is a (unary) operator, and an operator that is spelled as a word has to be a keyword.

dirkgently
+29  A: 

It does not literally return a value of type size_t, since size_t is not a concrete type in itself but rather a typedef for an unspecified built-in type. Typedef identifiers (such as size_t) are completely equivalent to their respective underlying types (and are converted thereto at compile time). If size_t is defined as an unsigned int on your platform, then sizeof returns an unsigned int when it is compiled on your system. size_t is just a handy way of maintaining portability, and stddef.h only needs to be included if you are using size_t explicitly by name.
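As a rough illustration, the following compiles without including any header, because the name size_t is never written. The helper names and the scratch array are hypothetical, and unsigned long is simply assumed to be wide enough for these small values.

```c
/* No #include lines on purpose: the name size_t is never mentioned. */

/* The compiler already knows the type of the sizeof result; here it is
   just converted to unsigned long on return. */
unsigned long bytes_in_double(void)
{
    double a;
    return sizeof a;   /* operand is an expression, so parentheses are optional */
}

/* The result is a constant expression, so it can size a static array. */
char scratch[sizeof(long) * 2];

unsigned long bytes_in_scratch(void)
{
    return sizeof scratch;
}
```

If the code needs to name the type portably, including stddef.h and writing size_t is the cleaner route.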

Volte
+1, the right answer. Anything that returns a typedef cannot really be said to return that typedef. It returns whatever the typedef boils down to. If you want portability, include stddef.h. If you don't, look up the definition of size_t on your platform and use that type directly instead.
Daniel Earwicker
Volte: I'm seeing the picture now. So the compiler cannot "see" size_t, but it uses a type for sizeof that is exactly the same as the one size_t is typedef'd to in the headers that ship with the compiler.
Ashwin
As pointed out in my answer, the compiler did "see" the typedef of size_t when it itself was compiled, so in a sense it "knows" about size_t.
anon
The size_t that the compiler saw is at best an implementation detail since conforming C compilers could be written in any language. It is a matter of standard that sizeof's return type is equivalent to size_t, but the actual underlying type is left to the implementation of the compiler.
Volte
@Neil You're not just wrong in this case, you're categorically wrong.
Frank Crook
+4  A: 

Some headers from the C standard are defined for a freestanding environment, i.e. they are fit for use in, say, an operating system kernel. They do not declare any functions, only macros and typedefs.

They are float.h, iso646.h, limits.h, stdarg.h, stdbool.h, stddef.h and stdint.h.

When working on an operating system, it isn't a bad idea to start with these headers. Having them available makes many things easier in your kernel. stdint.h in particular will come in handy (uint32_t et al.).
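A minimal sketch of kernel-style code that relies only on those headers and calls no library function. The struct page_entry and both helper names are invented for this example, and uint32_t is assumed to be available on the target.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint32_t */

/* Hypothetical kernel-style record built from fixed-width types. */
struct page_entry {
    uint32_t flags;
    uint32_t frame;
};

size_t page_entry_bytes(void)
{
    return sizeof(struct page_entry);
}

/* uint32_t has no padding bits, so this is exactly 32. */
unsigned bits_in_u32(void)
{
    return (unsigned)(sizeof(uint32_t) * CHAR_BIT);
}
```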

DevSolar
DevSolar: OS kernel was just an example, instead consider a case where I want to write a C program using sizeof, but not including any C header. Thanks for the info about the freestanding environment headers, I wasn't aware of that.
Ashwin
+2  A: 

I think that the main reasons that size_t is not a keyword are:

  • there's no compelling reason for it to be. The designers of the C and C++ languages have always preferred to have language features be implemented in the library if possible and reasonable
  • adding keywords to a language can create problems for an existing body of legacy code. This is another reason they are generally resistant to adding new keywords.

For example, in discussing the next major revision of the C++ standard, Stroustrup had this to say:

The C++0x improvements should be done in such a way that the resulting language is easier to learn and use. Among the rules of thumb for the committee are:

...

  • Prefer standard library facilities to language extensions

...

Michael Burr
A: 

FYI to people in this thread: sizeof() is an ordinary macro. You could write it yourself. Something like this...

#define sizeof(type) (((type *) NULL) + 1)
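For objects only, a pointer-arithmetic approximation can be sketched as below. This is not how sizeof is implemented, it cannot take a type name or a non-lvalue expression, and its result is not a constant expression. The macro name BYTES_OF_OBJECT and the helper functions are made up for illustration.

```c
#include <stddef.h>  /* size_t */

/* Distance in bytes between an object and the position one past it.
   Works only on lvalues, because it takes the operand's address. */
#define BYTES_OF_OBJECT(obj) \
    ((size_t)((char *)(&(obj) + 1) - (char *)&(obj)))

size_t bytes_of_sample_double(void)
{
    double d = 0.0;
    return BYTES_OF_OBJECT(d);
}

size_t bytes_of_sample_array(void)
{
    int a[4] = {0};
    return BYTES_OF_OBJECT(a);   /* &a has type int (*)[4], so +1 skips the whole array */
}
```

A type-name version would have to do arithmetic on a null pointer, which the standard does not define, so the real operator cannot be replaced this way.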

Blank Xavier
This is simply untrue. For one thing, sizeof works on both variables (no parens are required in this case) and type names. Such an implementation may work for some cases but there's no way to implement it properly using a simple preprocessor macro.
Volte
Agreed with @Volte.
Jonathan Leffler
With variables you have to take the address of the variable, to convert it into a pointer to its type, then add one.
Blank Xavier
You are correct. I had read about sizeof() as a macro, but it was for variables only; you need a second version for types.
Blank Xavier
At least since the first C standards (ANSI and ISO/IEC 9899:1990), sizeof has been specified as an operator, and overriding it with a macro is horribly wrong.
hlovdal
+2  A: 

Does not this signify some kind of a problem in the C standard specification?

Look up the difference between a hosted implementation of C and a freestanding C implementation. The freestanding (C99) implementation is required to provide headers:

  • <float.h>
  • <iso646.h>
  • <limits.h>
  • <stdarg.h>
  • <stdbool.h>
  • <stddef.h>
  • <stdint.h>

These headers do not define any functions at all. They define parts of the language that are somewhat compiler specific (for example, the offsetof macro in <stddef.h>, and the variable argument list macros and types in <stdarg.h>), but they can be handled without actually being built into the language as full keywords.

This means that even in your hypothetical kernel, you should expect the C compiler to provide these headers and any underlying support functions - even though you provide everything else.
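As a rough sketch, the two compiler-specific facilities mentioned above can be used with nothing but freestanding headers. The struct header and the functions length_offset and sum_ints are hypothetical names for this example.

```c
#include <stdarg.h>  /* va_list, va_start, va_arg, va_end */
#include <stddef.h>  /* size_t, offsetof */

/* Hypothetical record used to demonstrate offsetof. */
struct header {
    char tag;
    long length;
};

size_t length_offset(void)
{
    return offsetof(struct header, length);
}

/* Adds up `count` int arguments using only the <stdarg.h> macros. */
long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

Neither function needs anything from the hosted library, which is the point of the freestanding header set.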

Jonathan Leffler
Jonathan: That cleared all my doubts, thanks! :-)
Ashwin