Need help with buffer overrun.

+2 A:

While it won't help you in Windows, Valgrind is by far the best tool for detecting bad memory behavior.

If you are debugging the stack, you need to get down to low-level tools: place a canary in the stack frame (perhaps a buffer filled with something like 0xA5) around any potential suspects. Run the program in a debugger and see which canaries no longer contain the right contents. You will gobble up a large chunk of stack doing this, but it may help you spot exactly what is occurring.
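As a rough illustration of the canary idea, here is a minimal sketch in C. The names `guarded_buffer`, `guard_arm` and `canary_damage` are made up for this example, and the 8-byte `suspect` field stands in for whatever buffer is under suspicion. The canaries and the suspect buffer are placed in one struct because a compiler is free to reorder or drop separate local variables, while struct members keep their declared order.

```c
#include <stddef.h>
#include <string.h>

#define CANARY_BYTE 0xA5
#define CANARY_SIZE 16

/* Hypothetical layout: one canary on each side of the suspect buffer. */
struct guarded_buffer {
    unsigned char before[CANARY_SIZE];
    char          suspect[8];
    unsigned char after[CANARY_SIZE];
};

/* Fill both canaries with the known pattern. Call this on function entry. */
static void guard_arm(struct guarded_buffer *g)
{
    memset(g->before, CANARY_BYTE, sizeof g->before);
    memset(g->after,  CANARY_BYTE, sizeof g->after);
}

/* Count the bytes in one canary that no longer hold the pattern. */
static size_t canary_damage(const unsigned char *canary, size_t size)
{
    size_t damaged = 0;
    for (size_t i = 0; i < size; i++) {
        if (canary[i] != CANARY_BYTE)
            damaged++;
    }
    return damaged;
}
```

Declare a `struct guarded_buffer` as a local, call `guard_arm` first, use `suspect` where the original buffer was used, and check `canary_damage` on both sides just before returning (or inspect the canaries in the debugger). A nonzero count on `after` suggests a write past the end of `suspect`, and a nonzero count on `before` suggests a write before its start.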

Yann Ramin 2010-04-30 18:45:19

Yeah, I've used it in the past. While our server code does run on various flavors of Unix (Solaris/HP/AIX), it doesn't look like Valgrind is supported there, so unfortunately, it doesn't quite help me here.

Morinar 2010-04-30 18:48:47

A:

Wrap it in an exception handler and dump out useful information when it occurs.

Peter 2010-04-30 18:47:06

+1 A:

You could try putting some local variables on either end of the buffer, or even sentinels into the (slightly expanded) buffer itself, and trigger a breakpoint if those values aren't what you think they should be. Obviously, using a pattern that is not likely in the data would be a good idea.

dash-tom-bang 2010-04-30 18:52:57

I put in a handful of local variable buffers hoping to catch some value and the issue hasn't reproduced in 25 or so tries (which is 2-3x more than I've ever gone before). It's like the buffers I added padded everything just enough so that nothing ever crashed. Even when I was debugging into them, the buffers held the exact values I would expect them to right before returning from the function every time.

Morinar 2010-04-30 21:42:00

What if you just expand your buffer, and write some known values to the end of it?

dash-tom-bang 2010-04-30 21:44:59

You say that as if I know which buffer I'm overwriting. If that wasn't clear, I have absolutely no idea. If I knew which buffer I was overrunning I'd merely set a hardware breakpoint and win.

Morinar 2010-04-30 22:26:26

ha hah oh I see. I assumed it was one in particular, sorry. Usually in the case of stack stomping your overwrite will only destroy local variables, so if you've got a function with a number of local buffers, sentinels between them may help identify which one is overrunning.

dash-tom-bang 2010-04-30 23:06:23

You could set watchpoints on all your sentinel values too, so you break into the debugger as soon as one of them changes.

caf 2010-05-01 05:19:35

Accepting this as it helped me get to the solution: I ended up tracking this down by putting some local variables around various buffers and figuring out which buffer was overrunning. I then put hardware breakpoints on either side of the buffer and waited for it to reproduce. It was an insidious little bug where we were telling a function an 8 byte buffer was 10 bytes, and it was uppercasing characters (among other things), so it only reproduced if those extra two bytes happened to contain lowercase characters. Thanks all for the help!
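To make that failure mode concrete, here is a small sketch of why the bug was so intermittent. It is not the actual code from the server: `upcase_bytes` is a hypothetical stand-in for the real routine, and the overrun is simulated inside a deliberately oversized 10-byte array so the example itself stays within bounds.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define FIELD_SIZE 8   /* real size of the buffer */
#define GUARD_SIZE 2   /* extra bytes the caller wrongly claims */

/* Hypothetical stand-in for the real routine: uppercases len bytes in place. */
static void upcase_bytes(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
}

/*
 * Copy an 8-byte field and 2 neighbouring bytes into one array, run the
 * routine with the claimed length, and report whether the neighbours changed.
 * Returns 1 if the neighbouring bytes were modified, 0 otherwise.
 */
static int overrun_detected(const char *field, const char *guard,
                            size_t claimed_len)
{
    char storage[FIELD_SIZE + GUARD_SIZE];

    if (claimed_len > sizeof storage)
        claimed_len = sizeof storage;   /* keep the simulation in bounds */

    memcpy(storage, field, FIELD_SIZE);
    memcpy(storage + FIELD_SIZE, guard, GUARD_SIZE);
    upcase_bytes(storage, claimed_len);

    return memcmp(storage + FIELD_SIZE, guard, GUARD_SIZE) != 0;
}
```

With a claimed length of 10, `overrun_detected("abcdefgh", "xy", 10)` reports a change, while `overrun_detected("abcdefgh", "XY", 10)` does not, even though the same two extra bytes were touched. Assuming the default C locale, any neighbouring bytes that are not lowercase letters pass through `toupper` unchanged, so the overrun leaves no trace on most runs.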

Morinar 2010-05-04 15:49:57

A:

Does this program recurse at all? If so, I'd check there to ensure you don't have an infinite recursion bug. If you can't see it manually, sometimes you can catch it in the debugger by pausing frequently and observing the stack.

RickNotFred 2010-04-30 20:28:37

Nope. No recursion.

Morinar 2010-04-30 21:09:01

+1 A:

One thing I have done in the past to help narrow down a mystery bug like this was to create a variable with global visibility named `checkpoint`. Inside the culprit function, I set `checkpoint = 0;` as the very first line. Then, I added `++checkpoint;` statements before and after function calls or memory operations that I even remotely suspected might be able to cause an out-of-bounds memory reference (plus peppering the rest of the code so that I had a checkpoint at least every 10 lines or so). When your program crashes, the value of `checkpoint` will narrow down the range you need to focus on to a handful of lines of code. This may be a bit overkill; I do this sort of thing on embedded systems (where tools like Valgrind can't be used), but it should still be useful.
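A minimal sketch of what that instrumentation might look like in C follows. The function `process_record` and its body are invented for illustration; only the `checkpoint` variable and the `++checkpoint;` pattern come from the description above. The variable is marked `volatile` here as an assumption, so that the compiler keeps every increment and the value can be read from a debugger or core dump.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Global progress marker, readable after a crash. */
volatile int checkpoint;

/*
 * Hypothetical suspect function: copies src into dst (dst_size must be at
 * least 1) and replaces spaces with underscores. Returns the length of src.
 */
static size_t process_record(char *dst, size_t dst_size, const char *src)
{
    checkpoint = 0;

    size_t len = strlen(src);
    ++checkpoint;                        /* 1: input measured */

    snprintf(dst, dst_size, "%s", src);
    ++checkpoint;                        /* 2: bounded copy finished */

    for (size_t i = 0; dst[i] != '\0'; i++) {
        if (dst[i] == ' ')
            dst[i] = '_';
    }
    ++checkpoint;                        /* 3: in-place edit finished */

    return len;
}
```

If the program dies while `checkpoint` is 1, for example, the fault lies between the first and second increment. If it reaches the final value and the crash still happens on return, the function body ran to completion and the damage is more likely in the stack frame itself.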

bta 2010-04-30 20:47:28

Great idea! Will try that next.

Morinar 2010-04-30 21:05:29

I had one practically every other line... the value of it when it crashed was EXACTLY what it should have been. :-p

Morinar 2010-04-30 21:40:21

I don't understand what you mean. If `checkpoint` was 6 (for instance) when the program crashed, then your problem happened between the sixth and seventh `++checkpoint` statement. If you are able to read this value after the crash, it should pinpoint the source of your problem.

bta 2010-04-30 21:55:35

The crash happens upon exiting from a function. With the function itself full of increments, it crashed as one would expect right after hitting all of them.

Morinar 2010-04-30 22:25:18

If the error message is complaining about a buffer overrun and the crash happens upon returning from a function, then it sounds like you most likely have some code that is corrupting the function call stack. Either the stored value of the address to return to or the previous stack frame's cached register values have been corrupted. This might not be easy to track down. Try taking your crashing function and breaking it up into smaller sub-functions. If one of them crashes when it returns, it might give us some hints.

bta 2010-05-03 16:45:32

Also, you may want to look at the raw call stack in a debugger and see if the data for the previous stack frame looks familiar (like data that may have been read out of the database, for instance).

bta 2010-05-03 16:47:28
