Can someone point me to the implementation of the sizeof operator in C++, and also to some description of how it is implemented?

sizeof is one of the operators that cannot be overloaded.

So it means we cannot change its default behavior?

+11  A: 

http://en.wikipedia.org/wiki/Sizeof

Basically, to quote Bjarne Stroustrup's C++ FAQ:

Sizeof cannot be overloaded because built-in operations, such as incrementing a pointer into an array implicitly depends on it. Consider:

X a[10];
X* p = &a[3];
X* q = &a[3];
p++; // p points to a[4]
 // thus the integer value of p must be
 // sizeof(X) larger than the integer value of q

Thus, sizeof(X) could not be given a new and different meaning by the programmer without violating basic language rules.
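A minimal sketch of the point in the quote, assuming a flat address space where converting a pointer to std::uintptr_t gives its numeric address (true on mainstream platforms, but not promised by the standard). The struct X and the function name byte_step_of_increment are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical element type, used only for illustration.
struct X {
    int id;
    double weight;
};

// Returns how many bytes the address grows when a pointer into an
// array of X is incremented once.
std::size_t byte_step_of_increment() {
    X a[10] = {};
    X* p = &a[3];
    X* q = &a[3];
    p++;  // p now points to a[4]
    assert(p - q == 1);  // one element apart in pointer arithmetic

    // Compare the addresses as integers, as the quoted comment describes.
    auto p_value = reinterpret_cast<std::uintptr_t>(p);
    auto q_value = reinterpret_cast<std::uintptr_t>(q);
    return static_cast<std::size_t>(p_value - q_value);
}
```

On such a platform byte_step_of_increment() returns sizeof(X), which is why a user-supplied sizeof that returned anything else would contradict built-in pointer arithmetic.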

Yuval A
It doesn't give implementation details. The exact function definition would be a better option.
Alien01
sizeof cannot be a function because it accepts both types and values: compare sizeof (int) and sizeof ("123").
Anton Tykhyy
Anton, it could be a template and accept a type very well
Johannes Schaub - litb
but i guess the problem of returning a compile time constant would still remain. although it could return something like template<typename T> int_<I> operator sizeof() { ... } and I would be a compile time constant depending on T
Johannes Schaub - litb
it would not even need a body then. just a declaration like that would suffice with a nested static int const value member :p
Johannes Schaub - litb
the default implementation would be template<typename T> int_< ::sizeof(T) > operator sizeof(); (note the space between < and :: to avoid the digraph <:). but now, like Yuval explained greatly, any definition would just violate basic language rules :p
Johannes Schaub - litb
@litb: you can't define sizeof in C++ so that (sizeof(int)) will be legal, though obviously you *could* have sizeof<int>. sizeof is a separate production in the grammar, and there's a reason for it.
Anton Tykhyy
sizeof is an operator like ++ and -- just with the difference that it evaluates at compile time already. if you have rules that state how sizeof exp | sizeof (type-id) use a template declaration to determine the result, then it becomes valid. but as Yuval said, it would horribly violate basic rules
Johannes Schaub - litb
You could #define sizeof, but that is evil enough to go to hell :)
Robert Gould
@litb: I don't think (you can prove me wrong) that you can create a template construct that will render both 'identifier( type )' and 'identifier( variable )' correct and both resolve to a value. A function declaration/call match the pattern, but a declaration will not yield a value.
David Rodríguez - dribeas
dribeas, of course there will be language support needed. the static type of "variable" would be passed as the template argument for example, and what is returned will be handled special (i.e as the result, a nested member like ::value is used, which will evaluate at compile time).
Johannes Schaub - litb
like for example there is language support for user defined literals in c++1x: doing 100101_b can result in invocation of a operator "" _b <'1', '0', '0', '1', '0', '1'>(); (where that is declared as template<char... V> int operator "" _b();)
Johannes Schaub - litb
or what is returned will be just - as in c++1x too :) - a constexpr. constexpr functions can be called and their return value used as constant expressions, which make them perfectly valid for stuff where sizeof is used today
Johannes Schaub - litb
+19  A: 

sizeof is not a real operator in C++. It is merely special syntax which inserts a constant equal to the size of the argument. sizeof doesn't need or have any runtime support.
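A small sketch of what "inserts a constant" means in practice, assuming a C++11 compiler. The names buffer, SizeTag, never_defined and size_of_call_result are invented for this example. The result of sizeof can be used wherever a constant expression is required, and the operand is never evaluated, so a function that is declared but never defined can appear inside sizeof without a linker error.

```cpp
#include <cassert>
#include <cstddef>

// A constant expression can be used as an array bound ...
char buffer[sizeof(long) * 4];

// ... or as a template argument.
template <std::size_t N>
struct SizeTag {
    static constexpr std::size_t value = N;
};
static_assert(SizeTag<sizeof(char)>::value == 1,
              "sizeof(char) is 1 by definition");

// Declared but deliberately never defined (hypothetical name).
int never_defined();

// Only the type of the call expression matters; the call never happens,
// so the missing definition is not a problem.
std::size_t size_of_call_result() {
    std::size_t n = sizeof(never_defined());
    assert(n == sizeof(int));
    return n;
}
```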

Edit: do you want to know how to determine the size of a class/structure looking at its definition? The rules for this are part of the ABI, and compilers merely implement them. Basically the rules consist of

  1. size and alignment definitions for primitive types;
  2. structure, size and alignment of the various pointers;
  3. rules for packing fields in structures;
  4. rules about virtual table-related stuff (more esoteric).

However, ABIs are platform- and often vendor-specific, e.g. on x86 and (say) IA64 the size of A below will be different because IA64 does not permit unaligned data access.

struct A
{
    char i ;
    int  j ;
} ;

assert (sizeof (A) == 5)  ; // x86, MSVC #pragma pack(1)
assert (sizeof (A) == 8)  ; // x86, MSVC default
assert (sizeof (A) == 16) ; // IA64
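If you want to see what your own compiler and ABI decided, rather than trust numbers for someone else's platform, a sketch like the following can help. It assumes C++11 and default packing (no #pragma pack). The function name padding_before_j is made up. The static_asserts check only properties that hold regardless of the concrete sizes.

```cpp
#include <cassert>
#include <cstddef>

struct A {
    char i;
    int  j;
};

// Properties that hold whatever the concrete numbers are (default packing):
static_assert(sizeof(A) % alignof(A) == 0, "size is a multiple of the alignment");
static_assert(sizeof(A) >= sizeof(char) + sizeof(int), "both members must fit");
static_assert(offsetof(A, j) % alignof(int) == 0, "j starts at an aligned offset");

// Number of padding bytes the compiler placed between i and j.
std::size_t padding_before_j() {
    assert(offsetof(A, j) >= sizeof(char));
    return offsetof(A, j) - sizeof(char);
}
```

Printing sizeof(A), alignof(A) and padding_before_j() on each target shows how the packing rules of that ABI play out.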
Anton Tykhyy
So how does it calculate the constant value ie size of variable?
Alien01
Alien01, it does not calculate the size of the variable. sizeof() is resolved ("executed", if you will) at *compile time*. The compiler will insert the correct value - which it knows already. E.g., on a 32-bit x86 architecture, it will know that ints are 32 bits long, pointers are 32 bits, etc.
Mihai Limbășan
Alien01: I added a reference on how the *compiler* calculates structure sizes. At runtime, structure sizes are fixed and known, the same as when you write `int i = 8 ;` nobody actually *calculates* 8 at runtime.
Anton Tykhyy
I think that by default on pretty much all compilers struct A will have sizeof(A)==2. Even though ia64 requires strict alignment, a char can be addressed at any address (ie., its alignment requirement is 1). This doesn't really detract from your answer - just that specific example.
Michael Burr
Another note: ABIs are not platform-specific but compiler-specific when talking about C++. Each compiler will implement its own version. Though the C ABI is standard across all platforms/compilers (but POD sizes will vary).
Martin York
@Michael: you're right, I fixed my example accordingly.@Martin: thanks, clarified my answer here.
Anton Tykhyy
Martin, there is no Standard C ABI. It differs between platforms and compilers, just like it does with C++. i've heard people say that here on SO sometimes, but it's wrong. there can't be a cross platform ABI, if only because argument passing (in which order and whatnot) is unspecified
Johannes Schaub - litb
@Johannes, the absence of an official ABI standard does not prevent the existence of ad hoc ones. It is true often enough to be a useful statement.
Mark Ransom
+5  A: 

No, you can't change it. What do you hope to learn from seeing an implementation of it?

What sizeof does can't be written in C++ using more basic operations. It's not a function, nor part of a library the way printf or malloc are. It's inside the compiler.

Edit: If the compiler is itself written in C or C++, then you can think of the implementation being something like this:

size_t calculate_sizeof(expression_or_type)
{
   if (is_type(expression_or_type))
   {
       if (is_array_type(expression_or_type))
       {
           return array_size(expression_or_type) *
             calculate_sizeof(underlying_type_of_array(expression_or_type));
       }
       else
       {
           switch (expression_or_type)
           {
                case int_type:
                case unsigned_int_type:
                     return 4; //for example
                case char_type:
                case unsigned_char_type:
                case signed_char_type:
                     return 1;
                case pointer_type:
                     return 4; //for example

                //etc., for all the built-in types
                case class_or_struct_type:
                {
                     int base_size = compiler_overhead(expression_or_type);
                     for (/*loop over each class member*/)
                     {
                          base_size += calculate_sizeof(class_member) +
                              padding(class_member);
                     }
                     return round_up_to_multiple(base_size,
                              alignment_of_type(expression_or_type));
                }
                case union_type:
                {
                     int max_size = 0;
                     for (/*loop over each class member*/)
                     {
                          max_size = max(max_size, 
                             calculate_sizeof(class_member));
                     }
                     return round_up_to_multiple(max_size,
                            alignment_of_type(expression_or_type));
                }
           }
       }
   }
   else
   {
       return calculate_sizeof(type_of(expression_or_type));
   }
}

Note that this is very much pseudo-code. There are lots of things I haven't included, but this is the general idea. The compiler probably doesn't actually do this. It probably calculates the size of a type (including a class) once and stores it, instead of recalculating every time you write sizeof(X). It is also allowed to, e.g., have pointers of different sizes depending on what they point to.
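For the class and union cases, the same idea can be turned into something that compiles. This is only a sketch under simplifying assumptions: no base classes, no virtual functions, no bit-fields, and members laid out in declaration order. The Member struct and the function names are hypothetical and not taken from any real compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one member: its size and alignment in bytes.
struct Member {
    std::size_t size;
    std::size_t align;
};

// Round n up to the next multiple of align.
std::size_t round_up(std::size_t n, std::size_t align) {
    assert(align != 0);
    return (n + align - 1) / align * align;
}

// Struct: place members in order, padding each to its own alignment,
// then pad the end to the strictest alignment seen.
std::size_t struct_size(const std::vector<Member>& members) {
    std::size_t offset = 0;
    std::size_t max_align = 1;
    for (const Member& m : members) {
        offset = round_up(offset, m.align);  // padding before the member
        offset += m.size;
        max_align = std::max(max_align, m.align);
    }
    return round_up(offset, max_align);      // tail padding
}

// Union: the largest member, rounded up to the strictest alignment.
std::size_t union_size(const std::vector<Member>& members) {
    std::size_t max_size = 0;
    std::size_t max_align = 1;
    for (const Member& m : members) {
        max_size = std::max(max_size, m.size);
        max_align = std::max(max_align, m.align);
    }
    return round_up(max_size, max_align);
}
```

With a 1-byte char and a 4-byte int aligned to 4, struct_size({{1, 1}, {4, 4}}) gives 8. Describing the int with alignment 1, as under pack(1), gives 5. A real compiler additionally makes an empty class occupy at least one byte, which this sketch does not model.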

Doug
"What do you hope to learn from seeing an implementation of it?" This doesn't look like a valid answer.
Alien01
I know it's a dodge, but you might be actually interested in some related problem that has an easier answer...
Doug
+5  A: 

sizeof does what it does at compile time. Operator overloads are simply functions, and do what they do at run time. It is therefore not possible to overload sizeof, even if the C++ Standard allowed it.

anon
+3  A: 

sizeof is a compile-time operator, which means that it is evaluated at compile-time.

It cannot be overloaded, because it already has a meaning for every user-defined type: sizeof applied to a class is the size in memory of an object of that class, and sizeof applied to a variable is the size in memory of the object that the variable names.
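To illustrate, here is a short sketch with invented types Point and PlainPoint. sizeof gives the same answer for the class, for an object of it, and for a reference to that object, and non-virtual member functions (including overloaded operators) add no storage. The equality with PlainPoint is what mainstream ABIs do, not something the sketch claims the standard spells out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class with overloaded operators.
struct Point {
    int x;
    int y;
    Point operator+(const Point& other) const {
        return Point{x + other.x, y + other.y};
    }
    bool operator==(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

// Same data members, no member functions.
struct PlainPoint {
    int x;
    int y;
};

// sizeof on the type, on an object, and on a reference all agree.
std::size_t point_size_checked() {
    Point p{1, 2};
    Point& alias = p;
    assert(sizeof(p) == sizeof(Point));
    assert(sizeof(alias) == sizeof(Point));  // size of the referenced type
    return sizeof(p);
}
```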

Avi
A: 

Take a look at the source for the GNU C++ compiler for a real-world look at how this is done.

James Moore
A: 

Unless you need to see how C++-specific sizes are calculated (such as allocation for the v-table), you can look at Plan9's C compiler. It's much simpler than trying to tackle g++.

A: 

Variable:

#define getsize_var(x) ((char *)(&(x) + 1) - (char *)&(x))

Type:

#define getsize_type(type) ( (char*)((type*)(1) + 1) - (char*)((type *)(1)))
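A quick way to check the pointer-stepping idea against the built-in operator. The variable macro is repeated so the snippet stands alone. For the type form, this sketch steps over a real object instead of the made-up address 1, since arithmetic on a pointer that does not point to an object is not something the standard guarantees. Sample and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// The variable macro from above, reproduced unchanged.
#define getsize_var(x) ((char *)(&(x) + 1) - (char *)&(x))

// Hypothetical type used only for the check.
struct Sample {
    char tag;
    double value;
};

// Type form: create an object of T and step a pointer over it.
// Requires T to be default-constructible.
template <typename T>
std::ptrdiff_t size_by_pointer_step() {
    T object{};
    return getsize_var(object);
}

// For an array, &values has type int (*)[7], so +1 steps over the whole array.
std::ptrdiff_t array_size_by_step() {
    int values[7] = {};
    std::ptrdiff_t n = getsize_var(values);
    assert(n == static_cast<std::ptrdiff_t>(sizeof(values)));
    return n;
}
```

Note that both forms still rely on the compiler knowing the size of the type in order to compute &(x) + 1, so they restate sizeof rather than replace it.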
Kedar