To clear up a misconception about what a compiler can inline: a good enough compiler can inline calls made through function pointers. It can just inline function objects more easily, since more static information is available. For example, a pointer to a function that takes no parameters and returns a bool has type bool (*)(), which says nothing about which function it points to. A functor, on the other hand, has its own distinct type, so the template instantiation can call the functor's operator() statically rather than through a function pointer.
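The difference in static information can be checked at compile time. The following is only a minimal sketch, assuming a C++11 compiler (for static_assert, decltype and <type_traits>); the names always_true, always_false, true_functor, false_functor and call_it are hypothetical and exist only for illustration.

```cpp
#include <type_traits>

// Two unrelated functions that happen to share a signature.
bool always_true()  { return true; }
bool always_false() { return false; }

// Two functors with the same call signature.
struct true_functor  { bool operator()() const { return true; } };
struct false_functor { bool operator()() const { return false; } };

// Hypothetical helper template: calls whatever callable it is given.
template <typename F>
bool call_it(F f) { return f(); }

// Both function pointers have the same type, bool (*)(), so
// call_it<bool (*)()> is one instantiation shared by both. The callee
// is only known from the pointer's value, not from its type.
static_assert(std::is_same<decltype(&always_true), bool (*)()>::value,
              "a pointer to always_true has type bool (*)()");
static_assert(std::is_same<decltype(&always_true),
                           decltype(&always_false)>::value,
              "both function pointers share one type");

// Each functor is its own type, so call_it<true_functor> and
// call_it<false_functor> are separate instantiations, and each one
// names its operator() statically.
static_assert(!std::is_same<true_functor, false_functor>::value,
              "the functors are distinct types");
```

Calling call_it(&always_true) and call_it(true_functor()) both return true. The only difference is how much the type alone tells the compiler about the callee; an optimizer that propagates the constant pointer value can still inline the first call.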
In practice, though, it's mainly a matter of giving the compiler enough information to optimize effectively.
For example, Visual C++ 2008, given the following code with full optimizations:
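    #include "stdafx.h"
    #include <algorithm>
    #include <cstdio>   // printf

    const char print_me[] = "hello!";

    // Function object: its type alone identifies the code to call.
    class print_functor
    {
    public:
        void operator()(char c)
        {
            printf("%c", c);
        }
    };

    // Plain function: passed to for_each as a void (*)(char) pointer.
    void print_function(char c)
    {
        printf("%c", c);
    }

    int _tmain(int argc, _TCHAR* argv[])
    {
        // Note: the range covers the whole array, including the terminating '\0'.
        std::for_each(print_me, print_me + sizeof(print_me)/sizeof(print_me[0]), print_functor());
        printf("\n");

        std::for_each(print_me, print_me + sizeof(print_me)/sizeof(print_me[0]), print_function);

        return 0;
    }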
inlines both std::for_each calls completely. Incidentally, on the PC, the first for_each has an unnecessary lea ecx, [ecx].