Hello, how expensive is it to perform the dereference operation on a pointer in C++? I can imagine that the memory transfer is somehow proportional to the object size, but I want to know how expensive the dereference operation part is.

Thank you.

+1  A: 

Dereferencing can be expensive mostly because it costs an instruction to fetch data from memory, which might be far away and not exhibit locality of reference. In that case, the processor has to fetch the data from non-cached memory or even from the hard disk (in the case of a hard page fault).

Mehrdad Afshari
+22  A: 

Dereferencing, when translated into machine code, can mean different things depending on what you do with the dereferenced object. Accessing a single member of a class through a pointer is typically cheap. For example if c is a pointer to an instance of class C with an int member n then something like this:

int n = c->n;

might translate into one or two machine instructions and load a register with a single memory access.

On the other hand this implies making a complete copy of the object pointed to by c:

C d = *c;

The cost of this will depend on the size of C, but note that it is the copy that is the major expense and the 'dereference' part is really just 'using' the pointer address in the copy instructions.

Note that accessing members of large objects typically requires a pointer offset calculation and a memory access whether or not the object is local. Typically only very small objects are optimized to live only in registers.

If you are concerned about the cost of pointers over references then don't be. The difference between them is one of language semantics, and by the time the machine code is generated pointer and reference access look exactly the same.
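
To make the three cases above concrete, here is a minimal sketch. The layout of C and the function names are assumptions made for illustration, not anything from the question. Compiling it with optimizations and comparing the assembly for the first two functions is one way to check the pointer-versus-reference claim on your own compiler.

```cpp
// Hypothetical type for illustration: one small member plus a bulky payload.
struct C {
    int n;
    int payload[64];
};

// Reads one member through a pointer: typically a single load at a fixed offset.
int read_via_pointer(const C* c) {
    return c->n;
}

// The same read through a reference: compilers generally emit the same code.
int read_via_reference(const C& c) {
    return c.n;
}

// Copies the whole object: the cost grows with sizeof(C), not with the dereference.
C copy_whole_object(const C* c) {
    return *c;
}
```

Only copy_whole_object has a cost that depends on the size of C; the other two touch a single int.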

Charles Bailey
Thank you very much for your response.
tunnuz
In some cases references can offer the optimizer more opportunities for cleverness, because a reference cannot be reseated to refer to another object, while a pointer can be modified.
Roddy
@Roddy: I think there are opportunities if using references as parameters to inline methods instead of pointers.
rstevens
+1 nice explanation
JonH
A: 

The dereferencing of a pointer shouldn't be much more than copying an address to an (address) register. That's all.

EricSchaefer
Dereferencing usually means reading from that address as well. Which may vary in cost depending on cache, locality and such.
jalf
It does not matter what the dereferenced pointer is "usually" used for. Dereferencing is a well-defined operation, and it doesn't mean much more than copying an address from a memory location to an address register (or whatever is necessary on the specific platform). See litb's answer.
EricSchaefer
+22  A: 

It depends on what you do with the dereferenced pointer. A mere dereference operation does nothing in itself. If your pointer is a T*, it just yields an lvalue of type T which represents your object.

struct a {
    int big[42];
};

void f(a * t) {
    // does nothing. Only interesting for standard or compiler writers.
    // it just binds the lvalue to a reference t1. 
    a & t1 = *t; 
}

If you actually get the value out of that object denoted by the lvalue returned by the dereference operation, the compiler has to copy the data the object contains. For a simple POD, that is just a mere memcpy:

a aGlobalA;
void f(a * t) {
    // gets the value of the object denoted by *t, copying it into aGlobalA
    aGlobalA = *t; 
}

My gcc port outputs this code for f:

    sub     $29, $29, 24       ; subtract stack-pointer, creating this frame
    stw     $31, $29, 20       ; save return address
    add     $5, $0, $4         ; copy pointer t into $5 (src)
    add     $4, $0, aGlobalA   ; load address of aGlobalA into $4 (dst)
    add     $6, $0, 168        ; put size (168 bytes) as 3rd argument
    jal     memcpy             ; call memcpy
    ldw     $31, $29, 20       ; restore return address
    add     $29, $29, 24       ; add stack-pointer, destroying this frame
    jr      $31

Optimized machine code would use inline code instead of a call to memcpy, but that's really just an implementation detail. What is important is that merely writing *t doesn't execute any code, whereas accessing the value of that object actually requires copying it.

If we are dealing with a type that has a user-defined copy assignment operator, matters are more complex:

struct a {
    int big[42];
    void operator=(a const&) { }
};

The code for the same function f now looks like:

    sub     $29, $29, 8
    add     $29, $29, 8
    jr      $31

Hah. But that wasn't such a surprise, was it? After all, the compiler is supposed to call our operator=, and if it does nothing, the whole function also does nothing!

Conclusion

I think the conclusion we can draw is that it all depends on how the value returned by operator* is used. Even when we just dereference a plain pointer, we saw above that the generated code largely depends on the circumstances. I haven't shown how it behaves if we dereference a class type that has an overloaded operator*, but essentially it behaves like what we saw with operator=. All measurements were done with -O2, so the compiler properly inlined calls :)
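
For the overloaded operator* case that is not shown above, a minimal sketch might look like the following. The wrapper class handle and the two free functions are hypothetical names introduced only for this illustration. The idea is that operator* just returns an lvalue, so any cost comes from what the caller does with it.

```cpp
struct a {
    int big[42];
};

// Hypothetical minimal wrapper around a raw pointer.
class handle {
public:
    explicit handle(a* p) : p_(p) {}
    // Returns an lvalue referring to the pointee; no data is copied here.
    a& operator*() const { return *p_; }
private:
    a* p_;
};

// Only binds a reference and takes an address: nothing in the object is read.
int* first_element_address(handle h) {
    a& ref = *h;
    return ref.big;
}

// Copies the object out: all 42 ints have to be read.
a copy_out(handle h) {
    return *h;
}
```

With inlining enabled, first_element_address would be expected to reduce to returning the stored pointer, while copy_out still has to move the whole object.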

Johannes Schaub - litb
There's Jon Skeet for C#, Steven Lott for Python, and litb for C++. +1
Federico Ramponi
"It's as expensive as it gets" means "it's very expensive" which, as the rest of your answer explains, is not really the case.
ChrisN
Oh, I like how you actually provide ASM for this (I don't like the asm syntax you've preferred, though). Hehe. +1
Filip Ekberg
thanks, ChrisN, fixed.
Johannes Schaub - litb
Thank you, very complete answer.
tunnuz
+3  A: 

Multiple dereferences cost CPU cycles.

Instead of writing:

string name = first->next->next->next->name;
int age = first->next->next->next->age;

this is O(n)


Write it as:

node* billy_block = first->next->next->next;

string name = billy_block->name;
int age = billy_block->age;

this is O(1)

So your code will not "ask" each and every block just to get to the fourth block.

Multiple dereferencing is like a neighborhood where each person only knows the neighbor next to them.

Imagine asking a person on the first block where your friend Billy lives. He tells you he doesn't know your friend; he only knows his next-door neighbor, so he sends you there. You ask that neighbor, who answers the same way, and you keep asking until you arrive at your friend's block. Not very efficient.
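
A self-contained version of the second form could look like this. The node layout is an assumption that simply matches the member names used above, and read_fourth is a hypothetical helper name.

```cpp
#include <string>

// Hypothetical node type matching the member names used above.
struct node {
    std::string name;
    int age;
    node* next;
};

// Walks the chain once and reuses the result for both member reads.
// Assumes the list has at least four nodes; there is no null check.
void read_fourth(node* first, std::string& name, int& age) {
    node* billy_block = first->next->next->next;
    name = billy_block->name;
    age = billy_block->age;
}
```

Keeping the intermediate pointer in a named local also makes the intent clear to the reader, whether or not the compiler would have merged the repeated walks on its own.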

Michael Buen
Thank you for the suggestion.
tunnuz
Any half-decent compiler should optimize the first into the second.
Roddy
We can never be sure, there are some compilers that will just simply do our bidding.
Michael Buen
In this case, compilers will catch the Common SubExpression (CSE). If there's a function call between them, say string name = first->next->next->next->name(), any ->next could change and compilers will quickly give up. That's why getters/setters should be inline.
MSalters
+4  A: 

The most important factor in dereferencing pointers on ordinary systems is that you're likely to generate a cache miss. A random access to SDRAM costs tens of nanoseconds (e.g. 64). On gigahertz processors, this means that your processor is idling for hundreds (or more than a thousand) of cycles, without being able to do anything else in the meantime.

Only on SRAM-based systems (which you'll only find in embedded software), or when your software is cache-optimized, do the factors discussed in the other posts come into play.
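
One way to observe this effect is a pointer-chasing loop, where each load depends on the result of the previous one. The sketch below is only an illustration under stated assumptions: make_cycle and chase are hypothetical names, indices into a vector stand in for real pointers, and the fixed seed is arbitrary.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Builds a single cycle over n slots: next[i] is the slot visited after i.
// If shuffled is true, the visiting order is random, so consecutive loads
// tend to land on unrelated cache lines once n exceeds the cache size.
std::vector<std::size_t> make_cycle(std::size_t n, bool shuffled) {
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    if (shuffled) {
        std::mt19937 rng(12345);  // arbitrary fixed seed for repeatability
        std::shuffle(order.begin(), order.end(), rng);
    }
    std::vector<std::size_t> next(n);
    for (std::size_t i = 0; i < n; ++i) {
        next[order[i]] = order[(i + 1) % n];
    }
    return next;
}

// Follows the chain for a number of steps, starting at slot 0.
// Each load depends on the previous one, much like p = p->next.
std::size_t chase(const std::vector<std::size_t>& next, std::size_t steps) {
    std::size_t i = 0;
    for (std::size_t s = 0; s < steps; ++s) {
        i = next[i];
    }
    return i;
}
```

Timing chase on a sequential cycle and on a shuffled one, for example with std::chrono::steady_clock and an n much larger than the last-level cache, would be expected to show the shuffled version running noticeably slower. The actual ratio depends on the machine, so no figure is claimed here.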