I am building code that previously worked, but now I get a segfault and can't figure out what went wrong. gdb catches the error, but it doesn't point to an obvious cause. The source line it shows is the function signature, so execution hasn't even reached the function body. In the disassembly, the faulting instruction is still part of the stack setup, so maybe the stack is messed up. How should I go about debugging this? This is on QNX 6.2, console gdb only.

0x0816b829 in __ml (this=0x79b963c, anMultiplier=0) at ../u_matrix.cpp:56
56      tcMatrix tcMatrix::operator*(float64 anMultiplier)

0x816b820 <__ml>:       push   %ebp
0x816b821 <__ml+1>:     mov    %esp,%ebp
0x816b823 <__ml+3>:     sub    $0x13ac,%esp
0x816b829 <__ml+9>:     push   %edi
0x816b82a <__ml+10>:    push   %esi
0x816b82b <__ml+11>:    push   %ebx 
A: 

Does running "bt" in gdb show anything relevant?

Unknown
A: 

You could also try valgrind'ing it, which can give more info.

Zev
Since when does Valgrind support QNX?
Employed Russian
whoops, my mistake, sorry about that
Zev
A: 

The "this" pointer looks suspicious: 0x79b963c seems off, though it could be valid depending on how your objects are allocated. Try

print *this

and see whether the data makes sense or is garbage. It also looks like your source doesn't match the executable: the line in your example is an operator overload declaration, not an executable statement.

I would ignore that particular line, find the whole __ml function (operator* in your source), and try printing a few local variables to see whether you are inside a loop or some other scope that would have them.

I am guessing you have a matrix multiplication operator where a matrix is multiplied by a float. Most likely this is an index out of bounds or an off-by-one problem of some sort, where you wrote outside the memory you own and corrupted the stack.

Like Unknown said, try bt as well. If it comes back with a lot of ??()'s, then you do have a corrupt stack.
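
As a rough illustration of the out-of-bounds idea, here is a minimal sketch of a scalar multiply with checked element access. The Matrix class, its members, and at() are hypothetical stand-ins; the real tcMatrix is not shown in the question.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for tcMatrix; the real class layout is unknown.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // Checked access: throws instead of silently writing past the buffer.
    double& at(std::size_t r, std::size_t c) {
        if (r >= rows_ || c >= cols_) {
            throw std::out_of_range("Matrix index out of range");
        }
        return data_[r * cols_ + c];
    }

    // Scalar multiply. The loops use '<', not '<='; the off-by-one
    // variant would touch row rows_ and trip the check in at().
    Matrix operator*(double multiplier) const {
        Matrix result(rows_, cols_);
        for (std::size_t r = 0; r < rows_; ++r) {
            for (std::size_t c = 0; c < cols_; ++c) {
                result.at(r, c) = data_[r * cols_ + c] * multiplier;
            }
        }
        return result;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};
```

Temporarily routing element access through a checked accessor like this turns silent memory corruption into an exception you can catch in gdb at the point of the bad index.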

m1tk4
+1  A: 

The instruction you are crashing on is push %edi.

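This most likely means that you have a stack overflow: it is the first write to the stack after sub $0x13ac,%esp has reserved 0x13ac (5036) bytes for this frame.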

One likely cause of stack overflow is infinite recursion. If (gdb) where shows an unending stream of function calls, that's your problem.

If where shows a reasonable sequence of calls, execute info frame and up repeatedly, looking for frames with an unreasonably large size.
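
To illustrate how a frame gets large: the sub $0x13ac,%esp in the question reserves 5036 bytes for a single call. The sketch below shows how by-value locals with embedded arrays inflate a frame, and how heap-backed storage keeps it small. FixedMatrix, HeapMatrix, and N are assumptions for illustration only; the real tcMatrix layout isn't shown.

```cpp
#include <cstddef>
#include <vector>

// Arbitrary dimension chosen for the example.
const std::size_t N = 16;

// The array is embedded in the object, so every local FixedMatrix
// costs N * N * sizeof(double) bytes of stack.
struct FixedMatrix {
    double m[N][N];
};

// Only the small vector object sits on the stack; the elements
// are allocated on the heap.
struct HeapMatrix {
    std::vector<double> m;
    HeapMatrix() : m(N * N, 0.0) {}
};

// Stack bytes a function needs just to hold `count` local
// FixedMatrix temporaries (ignoring padding and other locals).
std::size_t stackBytesForLocals(std::size_t count) {
    return count * sizeof(FixedMatrix);
}
```

Comparing sizeof for your own matrix type against the frame sizes reported by info frame can show which function's temporaries are eating the stack.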

Finally, the problem may be caused by a change in your execution environment, and not by anything in your program. I am not sure what the QNX equivalent of ulimit -s is, but it's possible that your stack limit is simply too small.
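
The stack limit can also be checked from inside the program. This is a minimal sketch using POSIX getrlimit with RLIMIT_STACK; whether QNX 6.2 exposes this call the same way is an assumption here, and stackLimitDescription is a made-up helper name.

```cpp
#include <sys/resource.h>
#include <sstream>
#include <string>

// Hypothetical helper: describes the soft stack limit of the
// current process as text.
std::string stackLimitDescription() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_STACK, &lim) != 0) {
        return "unknown";      // the query itself failed
    }
    if (lim.rlim_cur == RLIM_INFINITY) {
        return "unlimited";    // no soft limit is set
    }
    std::ostringstream out;
    out << static_cast<unsigned long long>(lim.rlim_cur) << " bytes";
    return out.str();
}
```

Printing this at startup in both the working and the failing environment would show whether the limit changed between them.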

Employed Russian
+1  A: 

Following Employed Russian's answer:

ulimit -s works on QNX, but the stack limit is unlimited by default.

I would experiment with

ldrel -S2M -L yourexecutablename

to adjust the initial stack allocation and its laziness, and see whether the core dumps recur. You can also use QCC's -N flag to set a larger initial stack size.

m1tk4