ansaurus

Question

C++ performance of accessing member variables versus local variables

Answer 1

+2 A:

This should be your compiler's problem. Instead, optimize for maintainability: if the information is only ever used locally, store it in local (automatic) variables. I hate reading classes littered with member variables that don't actually tell me anything about the class itself, but only some details about how a bunch of methods work together :(

In fact, I would be surprised if local variables aren't faster anyway: they are bound to be in cache, since they are close to the rest of the function's data (the call frame), whereas the object a pointer refers to might be somewhere else entirely. But I am just guessing here.

Daren Thomas 2008-10-26 20:15:11

Answer 2

A:

When in doubt, benchmark and see for yourself. And make sure it makes a difference first - hundreds of times a second isn't a huge burden on a modern processor.

That said, I don't think there will be any difference. Both will be constant offsets from a pointer: the locals from the stack pointer and the members from the "this" pointer.
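To illustrate the "constant offset from a pointer" idea, here is a minimal sketch. The struct `Example` and both function names are hypothetical stand-ins, not code from the question. The first function spells out by hand what the compiler conceptually does for a member access.

```cpp
#include <cassert>   // assert
#include <cstddef>   // offsetof
#include <cstring>   // std::memcpy

// Hypothetical stand-in for the class in the question.
struct Example {
    int varA;
    int varB;
};

// Reads varB by hand: start at the object's address and add a
// compile-time constant offset, then load an int from there.
int read_member_by_offset(const Example* self) {
    assert(self != nullptr);
    const unsigned char* base = reinterpret_cast<const unsigned char*>(self);
    int value;
    std::memcpy(&value, base + offsetof(Example, varB), sizeof value);
    return value;
}

// The ordinary spelling of the same access.
int read_member(const Example* self) {
    assert(self != nullptr);
    return self->varB;
}
```

Both functions return the same value for the same object. A local is reached the same way, except that the base is the stack pointer (or the value sits in a register and needs no load at all).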

Mark Ransom 2008-10-26 20:15:20

Answer 3

A:

Using the member variables should be marginally faster since they only have to be allocated once (when the object is constructed) instead of every time the callback is invoked. But in comparison to the rest of the work you're probably doing, I expect this would be a very tiny percentage. Benchmark both and see which is faster.
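A rough sketch of such a benchmark, assuming a hypothetical class `Worker` that does the same work with local temporaries and with member temporaries. The names and the workload are made up for illustration. Real measurements should use an optimized build and many repetitions.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical class: the same computation written two ways.
class Worker {
public:
    // Temporaries live in local variables.
    std::uint64_t with_locals(std::uint64_t n) const {
        std::uint64_t sum = 0;
        for (std::uint64_t i = 0; i < n; ++i) sum += i * i;
        return sum;
    }
    // Temporaries live in member variables.
    std::uint64_t with_members(std::uint64_t n) {
        sum_ = 0;
        for (i_ = 0; i_ < n; ++i_) sum_ += i_ * i_;
        return sum_;
    }
private:
    std::uint64_t sum_ = 0;
    std::uint64_t i_ = 0;
};

// Calls f() `reps` times and returns the elapsed time in nanoseconds.
template <typename F>
long long time_ns(F f, int reps) {
    assert(reps > 0);
    volatile std::uint64_t sink = 0;  // keeps the results observable
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) sink = f();
    auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Usage would look like `time_ns([&] { return w.with_locals(1000000); }, 100)` against the same call with `with_members`. The numbers depend heavily on compiler and flags, so compare them only on your own machine and build.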

Andrew Medico 2008-10-26 20:15:37

Answer 4

+4 A:

I'd prefer the local variables on general principles, because they minimize evil mutable state in your program. As for performance, your profiler will tell you all you need to know. Locals should be faster for ints and perhaps other builtins, because they can be put in registers.

fizzer 2008-10-26 20:16:00

Answer 5

+1 A:

In my opinion, it should not impact performance, because:

In your first example, the variables are accessed via a lookup on the stack, e.g. [ESP+4], which means the current end of the stack plus four bytes.
In the second example, the variables are accessed via a lookup relative to this (remember, varB is equivalent to this->varB). This is a similar machine instruction.

Therefore, there is not much of a difference.

However, you should avoid copying the string ;)

Black 2008-10-26 20:16:25

Answer 6

+5 A:

Silly question.
It all depends on the compiler and what it does for optimization.

Even if it did work, what have you gained? A way to obfuscate your code?

Variable access is usually done via a pointer and an offset.

Pointer to Object + offset
Pointer to Stack Frame + offset

Also, don't forget to add in the cost of moving the variables to local storage and then copying the results back. All of which could be meaningless, as the compiler may be smart enough to optimize most of it away anyway.
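To make the copy-in/copy-out cost concrete, here is a minimal sketch. The class `Accumulator` and its members are hypothetical, not taken from the question. Both methods compute the same thing; the second pays for one extra load before the loop and one extra store after it, unless the optimizer removes them.

```cpp
#include <cassert>

// Hypothetical class; `total_` plays the role of the member variable.
class Accumulator {
public:
    // Works on the member directly through `this`.
    void add_direct(int n) {
        assert(n >= 0);
        for (int i = 0; i < n; ++i) total_ += i;
    }
    // Copies the member into a local, works on the copy, copies it back.
    void add_via_local(int n) {
        assert(n >= 0);
        int total = total_;   // copy in
        for (int i = 0; i < n; ++i) total += i;
        total_ = total;       // copy out
    }
    int total() const { return total_; }
private:
    int total_ = 0;
};
```

An optimizing compiler may well turn `add_direct` into something like `add_via_local` on its own, keeping the running total in a register for the duration of the loop.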

Martin York 2008-10-26 20:16:27

What do you mean the cost of moving variables to local storage then copying the results back? That's part of the question... is there any performance gain in copying values to member variables rather than local variables?

2008-10-26 22:17:32

Answer 7

A:

Also, there's a third option: static locals. These don't get re-allocated every time the function is called (in fact, they get preserved across calls) but they don't pollute the class with excessive member variables.
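A small sketch of what "preserved across calls" means in practice. Both function names are hypothetical. The first version leaks state from one call into the next; the second resets the static on every call to behave like an ordinary local.

```cpp
#include <cassert>

// Static local used as scratch space without a reset:
// the value left over from the previous call is still there.
int callback_static(int input) {
    static int scratch = 0;   // initialized once, survives across calls
    scratch += input;
    return scratch;
}

// Static local that is reset on every call, so it acts like a plain local.
int callback_static_reset(int input) {
    static int scratch;
    scratch = 0;              // explicit re-initialization on each call
    scratch += input;
    assert(scratch == input); // holds only because of the reset above
    return scratch;
}
```

Calling `callback_static(5)` twice returns 5 and then 10, while `callback_static_reset(5)` returns 5 both times.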

Andrew Medico 2008-10-26 20:18:26

He'd still have to initialize them each time to get the same behavior. And "allocation" for local variables amounts to a different baked-in stack pointer increment. So either way, the cost is the cost of initialization.

Shog9 2008-10-26 20:21:31

Answer 8

+16 A:

Executive summary: In virtually all scenarios, it doesn't matter, but there is a slight advantage for local variables.

Warning: You are micro-optimizing. You will end up spending hours trying to understand code that is supposed to win a nanosecond.

Warning: In your scenario, performance shouldn't be the question, but the role of the variables - are they temporary, or state of thisClass?

Warning: First, second and last rule of optimization: measure!

First of all, look at the typical assembly generated for x86 (your platform may vary):

// stack variable: load into eax
mov eax, [esp+10]

// member variable: load into eax
mov ecx, [address of object]
mov eax, [ecx+4]

Once the address of the object is loaded into a register, the instructions are identical. Loading the object address can usually be paired with an earlier instruction and doesn't hit execution time.

But this means the ecx register isn't available for other optimizations. However, modern CPUs do some intense trickery to make that less of an issue.

Also, when accessing many objects this may cost you extra. However, this is less than one cycle on average, and there are often more opportunities for pairing instructions.

Memory locality: here's a chance for the stack to win big time. Top of stack is virtually always in the L1 cache, so the load takes one cycle. The object is more likely to be pushed back to L2 cache (rule of thumb, 10 cycles) or main memory (100 cycles).

However, you pay this only for the first access. If all you have is a single access, the 10 or 100 cycles are unnoticeable. If you have thousands of accesses, the object data will be in L1 cache, too.

In summary, the gain is so small that it virtually never makes sense to copy member variables into locals to achieve better performance.

peterchen 2008-10-26 20:28:52

C++ does not have a specific ABI, but the ones I have read always reserve a register for the this pointer. So no gain. Plus you forget the cost of copying into and out of locals.

Martin York 2008-10-26 21:17:46

you are right, these things should be mentioned, too.

peterchen 2008-10-26 21:23:44

Since class data is placed consecutively into memory, the first time you access any member data, almost all of the class's state should be loaded as a cache line into L1. After that, the accesses should be identical.

AndreasT 2009-08-27 08:00:41

as said, "However, you pay this only for the first access". ;-)

peterchen 2009-08-27 08:37:37

Though you're not taking into account that the object may not currently be loaded into the CPU cache. Accessing that memory may cause a cache miss that requires the data to be fetched from RAM. So even though there may be hardly any difference in the common case, a cache miss can make it more significant.

Chad 2009-11-23 01:45:32

@Chad - "The object is more likely to be pushed back to L2 cache (rule of thumb, 10 cycles) or main memory (100 cycles). However, you pay this only for the first access..."

peterchen 2009-11-23 08:23:12

Answer 9

+1 A:

The amount of data that you will be interacting with will have a bigger influence on the execution speed than the way you represent the data in the implementation of the algorithm.

The processor does not really care if the data is on the stack or on the heap (apart from the chance that the top of the stack will be in the processor cache, as peterchen mentioned), but for maximum speed the data will have to fit into the processor's cache (L1 cache if you have more than one level of cache, which pretty much all modern processors have). Any load from L2 cache - or $DEITY forbid, main memory - will slow down the execution. So if you're processing a string that's a few hundred KB in size and changes on every invocation, the difference will not even be measurable.

Keep in mind that in most cases, a 10% speedup in a program is pretty much undetectable to the end user (unless you manage to reduce the runtime of your overnight batch from 25h back to less than 24h) so this is not worth fretting over unless you are sure and have the profiler output to back up that this particular piece of code is within the 10%-20% 'hot area' that has a major influence over your program's runtime.

Other considerations should be more important, like maintainability or other external factors. For example if the above code is in heavily multithreaded code, using local variables can make the implementation easier.

Timo Geusch 2008-10-26 20:41:30

Answer 10

+1 A:

It depends, but I expect there would be absolutely no difference.

What is important is this: using member variables as temporaries will make your code non-reentrant. For example, it will fail if two threads try to call callback() on the same object. Using static locals (or static member variables) is even worse, because then your code will fail if two threads try to call callback() on any thisClass object - or descendant.

Roddy 2008-10-26 20:45:13

Re-entrancy != concurrency. You make two true statements connected by a misleading "i.e.". Non-re-entrant code can go wrong even in single-threaded code, for instance if the callback calls something that calls it again, or if it's called from a signal handler that has interrupted it. +1 anyway.
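A single-threaded sketch of that failure mode, using a hypothetical class `Node` whose callback calls itself. The member version loses its "temporary" as soon as the nested call runs; the local version does not.

```cpp
#include <cassert>

// Hypothetical class whose callback keeps a temporary in a member.
class Node {
public:
    // Stores its argument in a member, re-enters itself, then returns
    // the member. The nested call has overwritten the value by then.
    int callback_member(int depth) {
        assert(depth >= 1);
        tmp_ = depth;                               // "temporary" kept in the object
        if (depth > 1) callback_member(depth - 1);  // re-entrant call clobbers tmp_
        return tmp_;                                // innermost value, not `depth`
    }
    // Same logic with a local: every invocation has its own copy.
    int callback_local(int depth) {
        assert(depth >= 1);
        int tmp = depth;
        if (depth > 1) callback_local(depth - 1);
        return tmp;
    }
private:
    int tmp_ = 0;
};
```

With a fresh `Node n`, `n.callback_member(3)` returns 1 while `n.callback_local(3)` returns 3, and no second thread is involved. A lock around the callback would not change that result.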

Steve Jessop 2008-10-30 03:29:38

The problem with conflating the two is that sometimes naive people think that by adding locks, or only having one thread, they can make their code re-entrancy-safe. Not so: this only makes it concurrency safe, and there are other causes for re-entrancy.

Steve Jessop 2008-10-30 03:31:38

"Re-entrancy != concurrency". You're right - thanks. I've changed 'i.e' to 'for example'.

Roddy 2008-10-30 14:33:35
