views:

1576

answers:

15

Compared to

  • Simple memory access
  • Disk access
  • Memory access on another computer(on the same network)
  • Disk access on another computer(on the same network)

in C++ on windows.

A: 

Depends on what that function does, it would fall 2nd on your list if it were doing logic with objects in memory. Further down the list if it included disk/network access.

Nick Craver
+3  A: 

Compared to a simple memory access - slightly more, negligible really.

Compared to every thing else listed - orders of magnitude less.

This should hold true for just about any language on any OS.

Ferruccio
+8  A: 

A function call is simply a shift of the frame pointer in memory onto the stack and addition of a new frame on top of that. The function parameters are shifted into local registers for use and the stack pointer is advanced to the new top of the stack for execution of the function.

In comparison with time

Function call ~ simple memory access
Function call < Disk Access
Function call < memory access on another computer
Function call < disk access on another computer

jW
Did you really mean "shift of the stack pointer"? "Shift of the instruction pointer" would see more appropriate to me. I would describe it as:IP is pushed onto stackIP is set to the function addressSP is advanced to reserve local variables spaces (may include stack frame creation)
Suma
Thank you for the clarification! I will update my answer for anyone who browses in the future
jW
+3  A: 

In general, a function call is going to be slightly slower than memory access since it in fact has to do multiple memory accesses to perform the call. For example, multiple pushes and pops of the stack are required for most function calls using __stdcall on x86. But if your memory access is to a page that isn't even in the L2 cache, the function call can be much faster if the destination and the stack are all in the CPU's memory caches.

For everything else, a function call is many (many) magnitudes faster.

Torlack
A: 

A function call usually involves merely a couple of memory copies (often into registers, so they should not take up much time) and then a jump operation. This will be slower than a memory access, but faster than any of the other operations mentioned above, because they require communication with other hardware. The same should usually hold true on any OS/language combination.

coppro
A: 

If the function is inlined at compile time, the cost of the function becomes equivelant to 0.

0 of course being, what you would have gotten by not having a function call, ie: inlined it yourself.

This of course sounds excessively obvious when I write it like that.

Kent Fredric
+2  A: 

Hard to answer because there are a lot of factors involved.

First of all, "Simple Memory Access" isn't simple. Since at modern clock speeds, a CPU can add two numbers faster than it get a number from one side of the chip to the other (The speed of light -- It's not just a good idea, it's the LAW)

So, is the function being called inside the CPU memory cache? Is the memory access you're comparing it too?

Then we have the function call will clear the CPU instruction pipeline, which will affect speed in a non-deterministic way.

James Curran
What do you mean, the law?
André Neves
Pipelines should not be flushed on modern CPU when call is made. CPU flushes its pipelines only when it wrongly predicts conditional jump (or when interrupt or similar unpredictable event occurs).
Mladen Jankovic
"The law" : The electrical impulses which make up a value in memory, moving at the speed of light, will take several CPU cycles to move the three inches from one end of the CPU to the other.
James Curran
It's the LAW!The definition of a law is that it is meant to be interpreted. Rules on the other hand are absolute.That is why Rugby has Laws and other sports have Rules.
Martin York
@James: three inches?? I'd like to know what kind of monster CPU you are working with...
Michael Borgwardt
@Michael... OkOkOk... A modern Intel CPU is 1.95"x1.95" meaning the corner to corner distance is only 2.75" --- except the signals aren't moving in a straight line-- but regardless, it's still several cycles at the speed of light.
James Curran
@James: the actual processor die is much smaller than that. The Core i7's die is 263mm^2, which is (assuming a square) about 0.64"x0.64"
Michael Borgwardt
A: 

The cost of actually calling the function, but not executing it in full? or the cost of actually executing the function? simply setting up a function call is not a costly operation (update the PC?). but obviously the cost of a function executing in full depends on what the function is doing.

+2  A: 

Assuming you mean the overhead of the call itself, rather than what the callee might do, it's definitely far, far quicker than all but the "simple" memory access.

It's probably slower than the memory access, but note that since the compiler can do inlining, function call overhead is sometimes zero. Even if not, it's at least possible on some architectures that some calls to code already in the instruction cache could be quicker than accessing main (uncached) memory. It depends how many registers need to be spilled to stack before making the call, and that sort of thing. Consult your compiler and calling convention documentation, although you're unlikely to be able to figure it out faster than disassembling the code emitted.

Also note that "simple" memory access sometimes isn't - if the OS has to bring the page in from disk then you've got a long wait on your hands. The same would be true if you jump into code currently paged out on disk.

If the underlying question is "when should I optimise my code to minimise the total number of function calls made?", then the answer is "very close to never".

Steve Jessop
+7  A: 

relative timings (shouldn't be off by more than a factor of 100 ;-)

  • memory-access in cache = 1
  • function call/return in cache = 2
  • memory-access out of cache = 10 .. 300
  • disk access = 1000 .. 1e8 (amortized depends upon the number of bytes transferred)
    • depending mostly upon seek times
    • the transfer itself can be pretty fast
    • involves at least a few thousand ops, since the user/system threshold must be crossed at least twice; an I/O request must be scheduled, the result must be written back; possibly buffers are allocated...
  • network calls = 1000 .. 1e9 (amortized depends upon the number of bytes transferred)
    • same argument as with disk i/o
    • the raw transfer speed can be quite high, but some process on the other computer must do the actual work
mfx
All agreed, except from what I know memory access out of cache is about 200~300 times cache access.
Asaf R
Changed. The 10... range is to include first-level caches. Also, i don't want to create the impression that a disk access can be as fast as a memory access; the situation is not *that* bad ;-)
mfx
A: 

The cost of a function call depends on the architecture. x86 is considerably slower (a few clocks plus a clock or so per function argument) while 64-bit is much less because most function arguments are passed in registers instead of on the stack.

Dark Shikari
A: 

Function call is actually a copy of parameters onto the stack (multiple memory access), register save, the actual code execution, and finally result copy and and registers restore (the registers save/restore depend on the system).

So.. speaking relatively:

  • Function call > Simple memory access.
  • Function call << Disk access - compared with memory it can be hundreds of times more expensive.
  • Function call << Memory access on another computer - the network bandwidth and protocol are the grand time killers here.
  • Function call <<< Disk access on another computer - all of the above and more :)
Adi
A: 

Only memory access is faster than a function call.

But the call can be avoided if compiler with inline optimization (for GCC compiler(s) and not only it is activated when using level 3 of optimization (-O3) ).

Iulian Şerbănoiu
+1  A: 

Let's not forget that C++ has virtual calls (significantly more expensive, about x10) and on WIndows you can expect VS to inline calls (0 cost by definition, as there is no call left in the binary)

MSalters
A: 

This link comes up a lot in Google. For future reference, I ran a short program in C# on the cost of a function call, and the answer is: "about six times the cost of inline". Below are details, see //Output at the bottom. UPDATE: To better compare apples with apples, I changed Class1.Method to return 'void', as so: public void Method1 () { // return 0; }
Still, inline is faster by 2x: inline (avg): 610 ms; function call (avg): 1380 ms. So the answer, updated, is "about two times".

using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Diagnostics;

namespace FunctionCallCost { class Program { static void Main(string[] args) { Debug.WriteLine("stop1"); int iMax = 100000000; //100M DateTime funcCall1 = DateTime.Now; Stopwatch sw = Stopwatch.StartNew();

        for (int i = 0; i < iMax; i++)
        {
            //gives about 5.94 seconds to do a billion loops, 
          // or 0.594 for 100M, about 6 times faster than
            //the method call.
        }

        sw.Stop(); 

        long iE = sw.ElapsedMilliseconds;

        Debug.WriteLine("elapsed time of main function (ms) is: " + iE.ToString());

        Debug.WriteLine("stop2");

        Class1 myClass1 = new Class1();
        Stopwatch sw2 = Stopwatch.StartNew();
        int dummyI;
        for (int ie = 0; ie < iMax; ie++)
        {
          dummyI =  myClass1.Method1();
        }
        sw2.Stop(); 

        long iE2 = sw2.ElapsedMilliseconds;

        Debug.WriteLine("elapsed time of helper class function (ms) is: " + iE2.ToString());

        Debug.WriteLine("Hi3");


    }
}

// Class 1 here using System; using System.Collections.Generic; using System.Linq; using System.Text;

namespace FunctionCallCost { class Class1 {

    public Class1()
    {
    }

    public int Method1 ()
    {
        return 0;
    }
}

}

// Output: stop1 elapsed time of main function (ms) is: 595 stop2 elapsed time of helper class function (ms) is: 3780

stop1 elapsed time of main function (ms) is: 592 stop2 elapsed time of helper class function (ms) is: 4042

stop1 elapsed time of main function (ms) is: 626 stop2 elapsed time of helper class function (ms) is: 3755