I'm afraid the days when you could determine the cost of a function call in clock cycles are long gone (for most platforms).
It might still be possible for some simple microcontrollers, but desktop processors are way too advanced for that.
The cost depends on many factors. The processor can reorder instructions and execute the instructions of your function in parallel, both with each other and with other instructions, even ones from a different thread (if the core supports simultaneous multithreading). Branch targets are "guessed" and executed before the condition is fully evaluated; if the guess turns out to be wrong, the speculative computations are discarded. The guess is based on statistics the processor gathers at run time. And the difference in execution time between code that is in cache and code that isn't can be enormous.

Even if you measured that executing the function took *x* cycles in some case, the next time you run the code you may get *y* cycles, and *y* might be very different from *x*.
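You can see this run-to-run spread for yourself. Here is a minimal sketch (in Python, timing wall-clock nanoseconds rather than cycles, and using a made-up `work` function) that times the exact same call repeatedly; on a desktop machine the samples typically vary noticeably:

```python
import time

def work(n):
    # A trivial function whose cost we sample repeatedly.
    total = 0
    for i in range(n):
        total += i * i
    return total

# Time the identical call many times; the spread between min and max
# illustrates that a single fixed "cost" does not exist on a modern CPU.
samples = []
for _ in range(200):
    start = time.perf_counter_ns()
    work(1000)
    samples.append(time.perf_counter_ns() - start)

print(f"min: {min(samples)} ns, max: {max(samples)} ns")
```

The first few samples are often the slowest (cold caches, cold branch predictor), which is exactly the effect described above.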
If you have a simple microcontroller and the documentation explicitly states how many clock cycles each instruction takes, you can take the disassembly of your function and add up the costs of all the instructions it consists of.
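As a sketch of that approach, here is a toy calculation (the mnemonics and cycle counts are illustrative, not taken from any real datasheet; substitute the values from your microcontroller's documentation):

```python
# Hypothetical per-instruction cycle counts, as a datasheet would list them.
cycle_cost = {"ldi": 1, "add": 1, "ld": 2, "st": 2, "ret": 4}

# Disassembly of a made-up function, reduced to a list of mnemonics.
disassembly = ["ldi", "ld", "add", "st", "ret"]

total = sum(cycle_cost[insn] for insn in disassembly)
print(f"total cost: {total} cycles")  # 1 + 2 + 1 + 2 + 4 = 10
```

On such a simple in-order core with no cache, this sum really is the execution time, which is why the technique only works there.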