views:

1213

answers:

9

I was just wondering how a stack overflow occurs and the best ways to make sure it doesn't happen, or ways to prevent one - particularly on web servers, but other examples would be interesting, as well.

A: 

http://en.wikipedia.org/wiki/Buffer_overflow

Tom Alderman
Buffer overflows are not the same thing...
Jason Bunting
+5  A: 

A stack overflow in real code occurs very rarely. Most situations in which it occurs are recursions where the termination has been forgotten. It might however rarely occur in highly nested structures, e.g. particularly large XML documents. The only real help here is to refactor the code to use an explicit stack object instead of the call stack.

Konrad Rudolph
+2  A: 

Infinite recursion is a common way to get a stack overflow error. To prevent - always make sure there's an exit path that will be hit. :-)

Another way to get a stack overflow (in C/C++, at least) is to declare some enormous variable on the stack.

char hugeArray[100000000];

That'll do it.

Matt Dillard
+3  A: 

Most people will tell you that a stack overflow occurs with recursion without an exit path - while mostly true, if you work with big enough data structures, even a proper recursion exit path won't help you.

Some options in this case:

Greg Hurlman
+2  A: 

Usually a stack overflow is the result of an infinite recursive call (given the usual amount of memory in standard computers nowadays).

When you make a call to a method, function or procedure the "standard" way or making the call consists on:

  1. Pushing the return direction for the call into the stack(that's the next sentence after the call)
  2. Usually the space for the return value get reserved into the stack
  3. Pushing each parameter into the stack (the order diverges and depends on each compiler, also some of them are sometimes stored on the CPU registers for performance improvements)
  4. Making the actual call.

So, usually this takes a few bytes depeding on the number and type of the parameters as well as the machine architecture.

You'll see then that if you start making recursive calls the stack will begin to grow. Now, stack is usually reserved in memory in such a way that it grows in opposite direction to the heap so, given a big number of calls without "coming back" the stack begins to get full.

Now, on older times stack overflow could occur simply because you exausted all available memory, just like that. With the virtual memory model (up to 4GB on a X86 system) that was out of the scope so usually, if you get an stack overflow error, look for an infinite recursive call.

Jorge Córdoba
+1  A: 

A stack overflow occurs when Jeff and Joel want to give the world a better place to get answers to technical questions. It's too late to prevent this stack overflow. That "other site" could have prevented it by not being scuzzy. ;)

Haacked
A: 

Considering this was tagged with "hacking", I suspect the "stack overflow" he's referring to is a call stack overflow, rather than a higher level stack overflow such as those referenced in most other answers here. It doesn't really apply to any managed or interpreted environments such as .NET, Java, Python, Perl, PHP, etc, which web apps are typically written in, so your only risk is the web server itself, which is probably written in C or C++.

Check out this thread:

http://stackoverflow.com/questions/7308/what-is-a-good-starting-point-for-learning-buffer-overflow

Steve M
+11  A: 

Stack

A stack, in this context, is the last in, first out buffer you place data while your program runs. Last in, first out (LIFO) means that the last thing you put in is always the first thing you get back out - if you push 2 items on the stack, 'A' and then 'B', then the first thing you pop off the stack will be 'B', and the next thing is 'A'.

When you call a function in your code, the next instruction after the function call is stored on the stack, and any storage space that might be overwritten by the function call. The function you call might use up more stack for its own local variables. When it's done, it frees up the local variable stack space it used, then returns to the previous function.

Stack overflow

A stack overflow is when you've used up more memory for the stack than your program was supposed to use. In embedded systems you might only have 256 bytes for the stack, and if each function takes up 32 bytes then you can only have function calls 8 deep - function 1 calls function 2 who calls function 3 who calls function 4 .... who calls function 8 who calls function 9, but function 9 overwrites memory outside the stack. This might overwrite memory, code, etc.

Many programmers make this mistake by calling function A that then calls function B, that then calls function C, that then calls function A. It might work most of the time, but just once the wrong input will cause it to go in that circle forever until the computer recognizes that the stack is overblown.

Recursive functions are also a cause for this, but if you're writing recursively (ie, your function calls itself) then you need to be aware of this and use static/global variables to prevent infinite recursion.

Generally, the OS and the programming language you're using manage the stack, and it's out of your hands. You should look at your call graph (a tree structure that shows from your main what each function calls) to see how deep your function calls go, and to detect cycles and recursion that are not intended. Intentional cycles and recursion need to be artificially checked to error out if they call each other too many times.

Beyond good programming practices, static and dynamic testing, there's not much you can do on these high level systems.

Embedded systems

In the embedded world, especially in high reliability code (automotive, aircraft, space) you do extensive code reviews and checking, but you also do the following:

  • Disallow recursion and cycles - enforced by policy and testing
  • Keep code and stack far apart (code in flash, stack in RAM, and never the twain shall meet)
  • Place guard bands around the stack - empty area of memory that you fill with a magic number (usually a software interrupt instruction, but there are many options here), and hundreds or thousands of times a second you look at the guard bands to make sure they haven't been overwritten.
  • Use memory protection (ie, no execute on the stack, no read or write just outside the stack)
  • Interrupts don't call secondary functions - they set flags, copy data, and let the application take care of processing it (otherwise you might get 8 deep in your function call tree, have an interrupt, and then go out another few functions inside the interrupt, causing the blowout). You have several call trees - one for the main processes, and one for each interrupt. If your interrupts can interrupt each other... well, there be dragons...

High-level languages and systems

But in high level languages run on operating systems:

  • Reduce your local variable storage (local variables are stored on the stack - although compilers are pretty smart about this and will sometimes put big locals on the heap if your call tree is shallow)
  • Avoid or strictly limit recursion
  • Don't break your programs up too far into smaller and smaller functions - even without counting local variables each function call consumes as much as 64 bytes on the stack (32 bit processor, saving half the CPU registers, flags, etc)
  • Keep your call tree shallow (similar to the above statement)

Web servers

It depends on the 'sandbox' you have whether you can control or even see the stack. Chances are good you can treat web servers as you would any other high level language and operating system - it's largely out of your hands, but check the language and server stack you're using. It is possible to blow the stack on your SQL server, for instance.

Adam Davis
+1  A: 

What? Nobody has any love for those cased by an infinite loop?

            do
            {

                JeffAtwood.WritesCode();

            } while(StackOverflow.MakingMadBank.Equals(false));

Ian Patrick Hughes