I know the usual advice is to pass variables of built-in arithmetic types (int, double, long double, etc.) to functions by value. But from an assembly point of view (performance-wise or space-wise), isn't there a situation where passing by reference would be more efficient, namely when the type is larger than a pointer? On my platform, for example, long double is 8 bytes while pointers are only 4 bytes.
In general, if the word size of the machine (and thus the pointer size) is smaller than the size of the integer, then passing by reference can be faster at the call site.
For example, on a 32-bit machine, passing a uint64_t by reference would be slightly faster than passing it by value: passing by value involves copying the integer, which requires two register loads, whereas passing by reference involves only one.
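To make the comparison concrete, here is a minimal sketch of the two calling styles. The function names are made up for illustration, and the actual cost depends on the ABI and optimization level, so treat the comments as what typically happens on a 32-bit target rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* By value: on a typical 32-bit ABI each 64-bit argument occupies two
   machine words (two registers or two stack slots). */
uint64_t add_by_value(uint64_t a, uint64_t b) {
    return a + b;
}

/* By pointer: each argument is a single pointer-sized word at the call
   site, but the callee has to load both halves through the pointer
   before it can do any arithmetic. */
uint64_t add_by_pointer(const uint64_t *a, const uint64_t *b) {
    assert(a != NULL && b != NULL);
    return *a + *b;
}
```

Both functions compute the same result, so the only way to see a difference is to compare the generated assembly (for example with gcc -m32 -O2 -S) or to measure.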
Regardless, it's unlikely to make any noticeable performance difference unless you're calling the function millions of times in a tight loop, in which case the function should probably be inlined if possible.
Passing a pointer/reference to an integer value larger than the native pointer size might well be locally optimal, but it's difficult to say whether it would be globally optimal. This largely comes down to how the callee uses the value. If it's truly an integer and is treated as such by the callee, then at some point the value is likely to be loaded into one or more registers anyway (so the program can perform arithmetic on it, for example), which costs the callee an extra dereference of the pointer. If the callee is inlined by an optimizing compiler, the compiler may simply pass the integer value split across two registers. If, however, the callee cannot be inlined (if it's third-party API code, for example), then the compiler cannot perform this kind of optimization, and passing a pointer might indeed be more efficient. That said, you're unlikely to find library functions that take an integer by reference unless it's so that the callee can modify the caller's value, which introduces a whole different set of issues.
More often than not, a modern optimizing compiler will make a close-to-optimal decision taking all of these things into consideration, and it is usually best for the programmer not to try to preempt the compiler with premature optimization. In fact, second-guessing the compiler this way may lead to less efficient code.
The most sensible thing to do in the vast majority of cases is to write your code in the way that best communicates your intent (pass-by-value for "value" types unless the argument is - adopting C# terminology - semantically an "out" or "reference" parameter) and worry about efficiency only if there is a clear performance bottleneck.
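As a rough illustration of "signature communicates intent", here is a small sketch. Both functions and their names are hypothetical; the point is only that the pointer parameter appears where the callee really does write to the caller's variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A plain "value" parameter: the callee only needs the number itself. */
uint64_t half_rounded_up(uint64_t n) {
    return n / 2 + n % 2;
}

/* An "out" parameter: the pointer is there because the callee writes
   the result into the caller's variable, not for speed. */
bool checked_add(uint64_t a, uint64_t b, uint64_t *out) {
    assert(out != NULL);
    if (a > UINT64_MAX - b) {
        return false;   /* would overflow; *out is left untouched */
    }
    *out = a + b;
    return true;
}
```

A caller would write something like checked_add(2, 3, &sum) and check the return value before using sum.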
The simple answer is that passing by reference is not as efficient, because the callee has to take the extra step of dereferencing to get at the value.
The happy answer is that, from an assembly language point of view, "value" and "reference" are only conventions, which the assembler does not enforce. For example, it is possible to modify a variable passed by value simply by letting the caller read the modified argument slot before the stack is cleaned up, à la cdecl (where the caller does the cleanup).
If you're passing a value that's only used several function calls deep, then it might be more efficient to pass it by reference (reference-to-const-T). If that's the case, though, you're exposing implementation details for the sake of premature "optimization".
I suspect that in the majority of cases, you'll lose significant performance due to the optimizations the compiler can no longer make (because you have an address-taken variable, and the pointer has escaped):
- The variable can't live in a register.
- The variable has to stay alive until the last function call in its scope has returned (i.e. its storage cannot be reused for another variable).
- The variable can change across function calls, which means the compiler has to forget everything it might have known about it between calls (e.g. it's positive/it's zero).
For example (I'm using pointer syntax to make things more explicit, but the same is true for references):
    long long x = 0, y = 1;
    for (int i = 0; i < 10; i++) {
        x = f(&x);
        g(&x);
        y = f(&y);
        g(&y);
    }
Pretty standard, but f() and g() could be annoying:
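    #include <stdio.h>
    #include <stdlib.h>

    long long f(long long *x) {
        static long long *old;              /* pointer kept from the previous call */
        if (old) { ++*old; *x += *old; }    /* changes the variable passed last time */
        old = x;                            /* the pointer escapes here */
        return ++*x;
    }
    void g(long long *x) {
        static long long *old;
        if (old == x) { abort(); }          /* behaviour depends on the address itself */
        old = x;
        printf("%lld\n", *x);
    }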
You can fix some of the problems by using long long const * (so the functions can't modify the value, but they can still read from it...).
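To show why const only fixes some of the problems, here is a small sketch; all three function names are hypothetical. A pointer-to-const stops this particular function from writing through it, but the object can still change through another, non-const alias, so the compiler still has to reload it after an opaque call.

```c
#include <assert.h>
#include <stddef.h>

/* Alias that some other part of the program kept around. */
static long long *saved;

void remember(long long *p) { saved = p; }

void bump_saved(void) {
    if (saved) { ++*saved; }
}

/* p is pointer-to-const, yet *p may differ before and after the call. */
long long change_seen_through_const(const long long *p) {
    assert(p != NULL);
    long long before = *p;
    bump_saved();           /* may modify the same object via 'saved' */
    return *p - before;     /* 1 if p aliases 'saved', otherwise 0 */
}
```

With long long v = 5; remember(&v); a call to change_seen_through_const(&v) returns 1 even though the function never writes through p.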
You can get around these by sticking the function call inside a block and passing a reference to a copy of the variable:
    {
        long long tmp = x;
        x = f(&tmp);
    }