I have a simple recursive function RCompare() that calls a more complex function Compare() which returns before the recursive call. Each recursion level uses 248 bytes of stack space which seems like way more than it should. Here is the recursive function:

void CMList::RCompare(MP n1) // RECURSIVE and Looping compare function
{
  auto MP ne=n1->mf;
  while(StkAvl() && Compare(n1=ne->mb))
    RCompare(n1); // Recursive call !
}

StkAvl() is a simple stack space check function that compares the address of an auto variable to the value of an address near the end of the stack stored in a static variable.
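
A check of the kind described above might look like the following minimal sketch. The names `InitStackLimit`, `StackAvailable`, and `Descend` are hypothetical and are not the original `CMList::StkAvl()`. It assumes a stack that grows toward lower addresses, as on x86, and it compares addresses as integers, which is implementation-defined behavior.

```cpp
#include <cstddef>
#include <cstdint>

// Lowest stack address the program allows itself to reach (hypothetical global).
static std::uintptr_t g_stack_limit = 0;

// Call once near the top of the call chain. Records a limit `budget` bytes
// below the current stack position. Assumes the stack grows downward.
void InitStackLimit(std::size_t budget) {
    volatile char marker = 0;  // auto variable; its address approximates the stack pointer
    const std::uintptr_t here = reinterpret_cast<std::uintptr_t>(&marker);
    g_stack_limit = (here > budget) ? here - budget : 0;
}

// True while the current frame is still above the recorded limit.
bool StackAvailable() {
    volatile char marker = 0;
    return reinterpret_cast<std::uintptr_t>(&marker) > g_stack_limit;
}

// Example recursion that stops once the budget is used up and reports its depth.
int Descend(int depth) {
    volatile char pad[128];  // some per-frame stack usage
    pad[0] = 0;
    if (!StackAvailable()) return depth;
    const int reached = Descend(depth + 1);
    pad[1] = 0;  // work after the call keeps this from becoming a tail call
    return reached;
}
```

Calling `InitStackLimit(64 * 1024)` early and then `Descend(0)` returns the depth at which the 64 KB budget ran out; the exact depth depends on the compiler and build mode.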

It seems to me that the only things added to the stack in each recursion are two pointer variables (MP is a pointer to a structure) and the stuff that one function call stores: a few saved registers, base pointer, return address, etc., all 32-bit (4-byte) values. There's no way that adds up to 248 bytes, is there?

I don't know how to actually look at the stack in a meaningful way in Visual Studio 2008.

Thanks


Added disassembly:

CMList::RCompare:
0043E000  push        ebp  
0043E001  mov         ebp,esp 
0043E003  sub         esp,0E4h 
0043E009  push        ebx  
0043E00A  push        esi  
0043E00B  push        edi  
0043E00C  push        ecx  
0043E00D  lea         edi,[ebp-0E4h] 
0043E013  mov         ecx,39h 
0043E018  mov         eax,0CCCCCCCCh 
0043E01D  rep stos    dword ptr es:[edi] 
0043E01F  pop         ecx  
0043E020  mov         dword ptr [ebp-8],edx 
0043E023  mov         dword ptr [ebp-14h],ecx 
0043E026  mov         eax,dword ptr [n1] 
0043E029  mov         ecx,dword ptr [eax+20h] 
0043E02C  mov         dword ptr [ne],ecx 
0043E02F  mov         ecx,dword ptr [this] 
0043E032  call        CMList::StkAvl (41D46Fh) 
0043E037  test        eax,eax 
0043E039  je          CMList::RCompare+63h (43E063h) 
0043E03B  mov         eax,dword ptr [ne] 
0043E03E  mov         ecx,dword ptr [eax+1Ch] 
0043E041  mov         dword ptr [n1],ecx 
0043E044  mov         edx,dword ptr [n1] 
0043E047  mov         ecx,dword ptr [this] 
0043E04A  call        CMList::Compare (41DA05h) 
0043E04F  movzx       edx,al 
0043E052  test        edx,edx 
0043E054  je          CMList::RCompare+63h (43E063h) 
0043E056  mov         edx,dword ptr [n1] 
0043E059  mov         ecx,dword ptr [this] 
0043E05C  call        CMList::RCompare (41EC9Dh) 
0043E061  jmp         CMList::RCompare+2Fh (43E02Fh) 
0043E063  pop         edi  
0043E064  pop         esi  
0043E065  pop         ebx  
0043E066  add         esp,0E4h 
0043E06C  cmp         ebp,esp 
0043E06E  call        @ILT+5295(__RTC_CheckEsp) (41E4B4h) 
0043E073  mov         esp,ebp 
0043E075  pop         ebp  
0043E076  ret              

Why 0E4h?


More Info:

class mch // match node structure
{
public:
    T_FSZ c1,c2;   // file indexes
    T_MSZ sz;      // match size
    enum ntyp typ; // type of node
    mch *mb,*mf;   // pointers to next and previous match nodes
};

typedef mch * MP; // for use in casting (MP) x

Should be a plain old pointer right? The same pointers are in the structure itself and they are just normal 4 byte pointers.


Edit: Added:

#pragma check_stack(off)
void CMList::RCompare(MP n1) // RECURSIVE and Looping compare function
{
  auto MP ne=n1->mf;
  while(StkAvl() && Compare(n1=ne->mb))
    RCompare(n1); // Recursive call !
} // end RCompare()
#pragma check_stack()

But it didn't change anything. :(

Now what?

A: 

That also depends on the compiler and the architecture you're running - e.g. it could be aligning to 256 bytes for faster execution, so that each level uses the 8 bytes of the variable + 248 padding.

Ofir
That makes no sense because 248 is not a power of two. If that were the case it should use exactly 256 per call. It is exactly 248 per call NOT 8+248.
Harvey
A: 

In Visual Studio you can look at the register "esp", the stack pointer, in a Watch (or Registers) window. Set a breakpoint in your function between one call and the next to see how much stack you consume.

For a plain function in debug mode in Visual Studio 2008 it is 16 bytes per function call.
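
The same measurement can be taken from inside the program by comparing the address of a local variable at two adjacent recursion levels. This is a rough sketch with hypothetical names (`Probe`, `ReportStackPerLevel`, `g_bytes_per_level`); the figure it reports depends on the compiler, build mode, and options, so no particular value is assumed.

```cpp
#include <cstdint>
#include <cstdio>

static std::uintptr_t g_prev_addr = 0;        // address seen one level up
static std::uintptr_t g_bytes_per_level = 0;  // last measured distance between levels

void Probe(int depth) {
    volatile char marker = 0;  // local whose address stands in for the stack pointer
    const std::uintptr_t here = reinterpret_cast<std::uintptr_t>(&marker);
    if (g_prev_addr != 0) {
        // Absolute difference, so the stack growth direction does not matter.
        g_bytes_per_level = (g_prev_addr > here) ? g_prev_addr - here
                                                 : here - g_prev_addr;
    }
    g_prev_addr = here;
    if (depth > 0) {
        Probe(depth - 1);
        marker = 1;  // touch the local after the call so the frame stays live
    }
}

void ReportStackPerLevel() {
    g_prev_addr = 0;
    Probe(4);
    std::printf("about %lu bytes per recursion level\n",
                static_cast<unsigned long>(g_bytes_per_level));
}
```

Building this once in Debug and once in Release should make the difference discussed in this thread visible without stepping through in the debugger.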

Arve
I know how much... 248 bytes each recursion. But what for? I would believe 40 or 50.
Harvey
It is 16 bytes in a plain function. It has to be the MP variable
Arve
Sorry "per function". Sometimes the spell checker tricks me to write something strange
Arve
Adding a pointer argument and a local pointer does indeed add 244 bytes per call. (It does this even if you use char pointers.) Strange.
Arve
See Nick D's answer.
Harvey
+4  A: 

Note that in debug mode the compiler reserves extra bytes on the stack, in every function,
and fills them with a sentinel value to help catch buffer overflow bugs.

0043E003  sub         esp, 0E4h ; <-- reserves 228 bytes
...
0043E00D  lea         edi,[ebp-0E4h] 
0043E013  mov         ecx, 39h 
0043E018  mov         eax, 0CCCCCCCCh ; <-- sentinel
0043E01D  rep stos    dword ptr es:[edi] ; <-- write sentinels
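
Adding up what the prologue in the question's disassembly puts on the stack appears to account for the 248 bytes per level. The sketch below only restates that listing as arithmetic. The constant names are made up for illustration, and it assumes 32-bit x86 with `this` and `n1` passed in registers, as the `ecx`/`edx` usage in the listing suggests. (`constexpr` and `static_assert` need C++11, which is newer than Visual Studio 2008.)

```cpp
// Byte counts read off the RCompare disassembly in the question (32-bit x86).
constexpr int kReturnAddress = 4;      // pushed by the `call` instruction
constexpr int kSavedEbp      = 4;      // push ebp
constexpr int kLocalsAndFill = 0xE4;   // sub esp,0E4h
constexpr int kSavedRegs     = 3 * 4;  // push ebx / push esi / push edi
// `push ecx` is undone by `pop ecx` right after the fill, so it is not counted.

// The fill loop writes 39h dwords of 0CCCCCCCCh, which covers the whole reserved area.
constexpr int kFillDwords = 0x39;
static_assert(kFillDwords * 4 == kLocalsAndFill, "fill covers the reserved bytes");

constexpr int kBytesPerCall =
    kReturnAddress + kSavedEbp + kLocalsAndFill + kSavedRegs;
static_assert(kBytesPerCall == 248, "matches the 248 bytes observed per recursion");
```

So 228 of the 248 bytes come from the debug-mode reservation, and only 20 from the return address and saved registers.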

Edit: the OP Harvey found the pragma that turns on/off stack probes.

check_stack

Instructs the compiler to turn off stack probes if off (or –) is specified,
or to turn on stack probes if on (or +) is specified.

#pragma check_stack([ {on | off}] )
#pragma check_stack{+ | –}

Update: as it turns out, stack probes are a different mechanism.
Try this: /GZ (Enable Stack Frame Run-Time Error Checking)
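
To switch the stack-frame check off for a single function rather than the whole project, the `runtime_checks` pragma mentioned in the comments can be wrapped around it. A minimal sketch, with a hypothetical `Node` type and `CountNodes` function standing in for the real code; the `_MSC_VER` guard is only there so that other compilers skip the MSVC-specific pragma.

```cpp
#include <cstddef>

// Hypothetical stand-in for the real list node type.
struct Node {
    Node* next;
};

#ifdef _MSC_VER
#pragma runtime_checks("s", off)      // disable stack-frame checks from here on
#endif

// Recursive function that should use as little stack per level as possible.
std::size_t CountNodes(const Node* n) {
    return n ? 1 + CountNodes(n->next) : 0;
}

#ifdef _MSC_VER
#pragma runtime_checks("s", restore)  // back to the project-wide setting
#endif
```

Functions outside the pragma pair keep whatever /RTC setting the project uses, so the checks stay available everywhere else in the Debug build.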

Nick D
+1. depending on the compiler options, it might even do that in a release build
nikie
OK, that sounds like what may be going on. Is there a #pragma to turn that off where I don't want it?
Harvey
@Harvey, I don't know if there is a pragma for that. On a release build it should be removed.
Nick D
Good, now how to get rid of that?
Harvey
@nikie, thanks for the info.
Nick D
Ah! I found: `#pragma check_stack([ {on | off}] )` and `#pragma check_stack{+ | –}`
Harvey
Thanks all very much!
Harvey
@Harvey, excellent! I'll update my answer.
Nick D
I spoke too soon... :( The #pragmas didn't change a thing. It still uses 248 in Debug mode and only 12 or 16 in Release mode. Now what?
Harvey
@Harvey, see my update.
Nick D
Well, now in Code Generation/Basic Runtime Checks I changed "Both (/RTC1, equiv. to /RTCsu)" to "Uninitialized Variables (/RTCu)" and that reduced the stack usage to about 100 bytes. Well Release uses only 12 or 16 bytes. That is good. I'm going to let this go for now. THANKS ALL!
Harvey
@Harvey, additionally try the `#pragma runtime_checks("s", off)` see http://msdn.microsoft.com/en-us/library/6kasb93x%28VS.80%29.aspx
Nick D
Good! Thanks Nick. That brings it down to 94 bytes per call compared to 12 or 16 in Release build. Good enough for Debug for now.
Harvey
A: 

I guess some space has to be allocated for exception handling. Did you look at the disassembly?

nikie
Added it for you
Harvey