What are the benefits of inlining different types of functions, and what issues would I need to watch out for when developing around them? I am not very proficient with a profiler, but in many different algorithmic applications inlining seems to increase the speed eightfold. If you can give any pointers, that would be of great use to me.
Inline functions are often overused, and the consequences are significant. The inline keyword indicates to the compiler that a function may be considered for inline expansion. If the compiler chooses to inline a function, the function is not called; its body is copied into place at the call site. The performance gain comes from avoiding the function call, the stack frame manipulation, and the function return. The gains can be considerable.
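As a minimal sketch of what "copied into place" means, consider the following. The names square and sum_of_squares are made up for illustration, and whether the call is actually expanded is still the compiler's decision:

```cpp
#include <cstddef>

// Hypothetical helper: small enough that the call overhead could
// outweigh the work done in the body.
inline int square(int x) {
    return x * x;
}

// If the compiler chooses to expand square() here, the loop body
// effectively becomes `total += values[i] * values[i];` with no call,
// no stack frame setup, and no return.
int sum_of_squares(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += square(values[i]);
    }
    return total;
}
```

The result is the same either way; only the generated machine code differs.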
Beware that inline functions can increase program size. They can also increase execution time by reducing the caller's locality of reference: when code size increases, the caller's inner loop may no longer fit in the processor's instruction cache, causing unnecessary cache misses and the consequent performance hit. Inline functions also increase build times. If an inline function changes, every file that uses it must be recompiled. Some guidelines:
- Avoid inlining functions until profiling indicates which functions could benefit from inlining.
- Consider using your compiler's option for auto-inlining after profiling both with and without auto-inlining.
- Only inline functions where the function call overhead is large relative to the function's code. In other words, inlining large functions or functions that call other (possibly inlined) functions is not a good idea.
The most important pointer is that you should in almost all cases let the compiler do its thing and not worry about it.
The compiler is free to perform inline expansion of a function even if you do not declare it inline, and it is free not to perform inline expansion even if you do declare it inline. It's entirely up to the compiler, which is okay, because in most cases it knows far better than you do when a function should be expanded inline.
The main benefits of inlining a function are that you remove the calling overhead and allow the compiler to optimize across the call boundaries. Generally, the more freedom you give the optimizer, the better your program will perform.
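To illustrate optimizing across the call boundary, here is a small sketch with made-up names (apply_sign and negated). Once the call is expanded, the constant argument becomes visible to the optimizer, which can then typically drop the branch entirely:

```cpp
// Hypothetical helper with a run-time flag.
inline int apply_sign(int value, bool negate) {
    if (negate) {
        return -value;
    }
    return value;
}

// The flag is a compile-time constant at this call site. If the call is
// inlined, the optimizer can see that and reduce the body to `return -v;`.
// Behind an opaque call boundary, the branch would have to stay.
int negated(int v) {
    return apply_sign(v, true);
}
```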
The downside is that, wherever a call has been inlined, the function no longer exists as a separate entity. A debugger may not be able to tell you're inside of it, and if the compiler emits no out-of-line copy, no outside code can call it. You also can't replace its definition at run time, as the function body exists in many different locations.
Furthermore, the size of your binary is increased.
Generally, you should declare a function static if it has no external callers, rather than marking it inline. Only let a function be inlined if you're sure there are no negative side effects.
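A short sketch of the static approach, with hypothetical names (clamp_to_byte and brighten). Because the helper has internal linkage, the compiler can see every caller and decide for itself whether to inline it and drop the standalone copy:

```cpp
// Internal linkage: nothing outside this source file can call this helper.
static int clamp_to_byte(int value) {
    if (value < 0) {
        return 0;
    }
    if (value > 255) {
        return 255;
    }
    return value;
}

// Externally visible entry point that uses the file-local helper.
int brighten(int pixel, int amount) {
    return clamp_to_byte(pixel + amount);
}
```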
One reason the compiler does a better job of inlining than the programmer is that the cost/benefit tradeoff is actually decided at the lowest level of machine abstraction: how many assembly instructions make up the function that you want to inline. Consider the ratio between the execution time of a typical non-branching assembly instruction and that of a function call. This ratio is predictable to the machine code generator, which is why the compiler can use that information to guide inlining.
The high-level compiler will often try to take advantage of another opportunity for inlining: when a function B is called only from function A and never from anywhere else. This inlining is not done for performance reasons (assuming A and B are not small functions), but it is useful in reducing linking time by reducing the total number of "functions" that need to be generated.
Added examples
An example of where the compiler performs massive inlining (with massive speedup) is in the compilation of the STL containers. The STL container classes are written to be highly generic, and as a result each "function" performs only a tiny bit of work. When inlining is disabled, for example when compiling in debug mode, the speed of STL containers drops considerably.
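For instance, the loop below (sum_elements is a made-up name) calls size() and operator[] on every iteration. Both are tiny member functions; in an optimized build they are typically expanded into plain pointer arithmetic, whereas in an unoptimized build they usually remain real calls:

```cpp
#include <cstddef>
#include <vector>

// Each iteration goes through two small std::vector member functions.
// How much of this is inlined depends on the compiler and its settings.
long long sum_elements(const std::vector<int>& values) {
    long long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}
```

Timing this function with and without optimization enabled is a simple way to see the effect on your own toolchain.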
A second example would be when the callee function contains certain instructions that require the stack to be undisturbed between the caller and callee. This happens with SIMD instructions using intrinsics. Fortunately, the compilers are smart enough to automatically inline these callee functions because they can inspect whether SIMD assembly instructions are emitted and inline them to make sure the stack is undisturbed.
The bottom line
Unless you are familiar with low-level profiling and are good at assembly programming/optimization, it is better to let the compiler do the job. The STL is a special case in which it might make sense to enable inlining (with a switch) even in debug mode.
Function call overhead is pretty small. A more significant advantage of inline functions is the ability to use "by reference" variables directly without needing an extra level of pointer indirection. A function which makes heavy use of parameters passed by reference may benefit greatly if its parameters devolve to simple variables or fields.
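A sketch of the by-reference case, using hypothetical names (accumulate and sum_plus_max). If accumulate stays a real call, sum and max_seen must live in memory so that their addresses can be passed. If it is inlined, the compiler is free to keep both in registers for the whole loop:

```cpp
#include <climits>

// Hypothetical helper that updates two running values through references.
inline void accumulate(int sample, int& sum, int& max_seen) {
    sum += sample;
    if (sample > max_seen) {
        max_seen = sample;
    }
}

// Returns the sum of the samples plus the largest sample,
// or 0 when there are no samples.
int sum_plus_max(const int* samples, int count) {
    if (count <= 0) {
        return 0;
    }
    int sum = 0;
    int max_seen = INT_MIN;
    for (int i = 0; i < count; ++i) {
        accumulate(samples[i], sum, max_seen);
    }
    return sum + max_seen;
}
```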