My application stops on a line of code that appears to have nothing wrong with it, yet my IDE suspends on that line with the error:

gdb/mi (24/03/09 13:36) (Exited. Signal 'SIGSEGV' received. Description: Segmentation fault.)

The line of code simply calls a method which has no code in it. Isn't a segmentation fault when you have a null reference? If so, how can an empty method have a null reference?

This piece of code seems to be causing the issue:

#include <sys/socket.h>

#define BUFFER_SIZE 256

char *buffer;

buffer = (char*)GetSomePointer()->SomeStackMemoryString.c_str();
int writeResult = write(socketFD, buffer, BUFFER_SIZE);

bzero(buffer, BUFFER_SIZE);
int readResult = read(socketFD, buffer, BUFFER_SIZE);

When the line using the read(...) method is commented out, the problem goes away.

Update:

I have changed the question to point toward the actual problem and removed all the irrelevant code. I have also answered my own question so that readers know specifically what the issue is; please read my answer before saying "you're a moron!".

+10  A: 

First, calling a method through a null pointer or reference is strictly speaking undefined behaviour. But it may succeed unless the call is virtual.

Calling virtual methods virtually (through a pointer/reference, not from the derived class with the Class::Method() form of invocation) fails in practice if the reference/pointer is null, because a virtual call requires access to the vtable, and the vtable cannot be reached through a null pointer/reference. So you can't call even an empty virtual method through a null reference/pointer.

To understand this you need to know more about how code is organized. For every non-inlined method there's a section of code segment containing the machine code implementing the method.

When a call is done non-virtually (either from a derived class or a non-virtual method through a reference/pointer) the compiler knows exactly which method to call (no polymorphism). So it just inserts a call to an exact portion of code and passes the this pointer as the first parameter there. When calling through a null pointer, this will be null too, but that goes unnoticed if the method is empty.

When a call is done virtually (through a reference/pointer) the compiler doesn't know exactly which method to call; it only knows that there's a table of virtual methods and that the address of the table is stored in the object. To find which method to call, it is necessary to first dereference the pointer/reference, get to the table, get the address of the method from it, and only then call the method. Reading the table is done at runtime, not during compilation. If the pointer/reference is null you get a segmentation fault at this point.

This also explains why virtual calls generally can't be inlined: unless the compiler can prove the object's dynamic type, it has no idea what code to inline when it's looking at the source during compilation.
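To make the difference concrete, here is a minimal sketch that spells out the dispatch by hand. It is only a model: the names Shape, VTable, vptr, staticName and virtualName are invented for illustration, and real compilers are free to lay things out differently. Because the functions here are free functions rather than members, passing a null pointer to staticName is well defined in this model, whereas calling a real member function through a null pointer remains undefined behaviour.

```cpp
#include <cassert>
#include <string>

// Hypothetical hand-written stand-in for what a compiler might generate.
struct Shape;

struct VTable {
    std::string (*name)(const Shape*);  // one slot per virtual method
};

struct Shape {
    const VTable* vptr;                 // hidden table pointer stored in every object
};

std::string circleName(const Shape*) { return "circle"; }
std::string squareName(const Shape*) { return "square"; }

const VTable circleTable = { circleName };
const VTable squareTable = { squareName };

// Non-virtual call: the target is fixed at compile time, and `self`
// is merely an argument that this (empty-ish) body never reads.
std::string staticName(const Shape* /*self*/) { return "shape"; }

// Virtual call: self->vptr must be read at runtime before the target
// is known, so a null `self` faults right here.
std::string virtualName(const Shape* self) {
    return self->vptr->name(self);
}
```

With `Shape c{&circleTable};`, `virtualName(&c)` yields "circle" and `staticName(nullptr)` still yields "shape", while `virtualName(nullptr)` would dereference a null pointer before any method body runs.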

sharptooth
It sounds like the answer I'm looking for, but I'm not sure I understand the theory behind what you're explaining. Is there a tutorial on this?
nbolton
Added to answer.
sharptooth
Calling a static method through a null pointer is supported and in the standard.
Torlack
+2  A: 

I cannot think of any reason why an empty method on its own would cause such a problem. Without any other context, my first thought would be that a problem elsewhere is corrupting your memory and it just so happens to manifest itself in this way here.

We had that kind of a problem before, and I wrote about it in this answer here. That same question has a lot of other good advice in it too which might help you.

Richard Corden
A: 

Are you calling the virtual method from the constructor of a base class? That could be the problem: If you're calling a pure virtual method from class Base in Base's constructor, and it is only actually defined in class Derived, you might end up accessing a vtable record that has not yet been set, because Derived's constructor has not been executed at that point.
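A small sketch of the effect, using a non-pure virtual so that the behaviour is well defined (Base, Derived, name and seenInCtor are made-up names). During Base's constructor the object is still a Base, so the call resolves to Base::name. If name were pure virtual in Base, the same call would be undefined behaviour.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seenInCtor;             // what the virtual call resolved to during construction

    Base() { seenInCtor = name(); }     // dispatches to Base::name, not Derived::name
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};
```

After `Derived d;`, `d.seenInCtor` holds "Base" even though `d.name()` returns "Derived" once construction has finished.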

Carl Seleborg
No, not the constructor.
nbolton
Is it a call through a base type pointer? Through "this"?
Carl Seleborg
Nope, the virtual method is defined and declared within the class.
nbolton
+1  A: 

Isn't a segmentation fault when you have a null reference?

Possibly, but not necessarily. What causes a segfault is somewhat platform-specific, but it basically means that your program is accessing memory that it shouldn't be. You might want to read the Wikipedia article to get a better idea of what it is.

One thing you might check on, does the empty method have a return type? I could be wrong on this, but if it returns an object, I could see how a copy constructor can get called on garbage if the method isn't actually returning an object. This could cause all sorts of wonky behavior.

Do you get the same result if you change its return type to void or you return a value?

Jason Baker
Thanks for your answer. No, it's a void method with no params.
nbolton
+3  A: 

Without code, the best I can do is a wild guess. But here goes:

Your "long-running code" is writing to an invalid pointer (either a totally random pointer, or one going past the beginning/end of a buffer or array). This happens to be clobbering the virtual function table for your object: it is overwriting either the pointer to the object, or the vptr member of the object, or the actual global virtual function table for that class.

Some things to try:

  • Put a sentinel member in your class. E.g. an int which is initialised to a known pattern (0xdeadbeef or 0xcafebabe are common) in your constructor, and never changed. Before you make the virtual function call, check (assert()) that it still has the right value.
  • Try using a memory debugger. On Linux, options include Electric Fence (efence) or Valgrind.
  • Run your program under a debugger (gdb is fine) and poke around to see what's wrong - either post-mortem after the segfault happens, or by setting a breakpoint just before the place it's going to segfault.
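The sentinel idea from the first bullet could look roughly like this. Connection, onData, intact and checkedOnData are hypothetical names, and the check only fires in builds where assert() is enabled (NDEBUG not defined).

```cpp
#include <cassert>
#include <cstdint>

class Connection {                       // hypothetical class name
public:
    virtual ~Connection() = default;
    virtual void onData() {}             // the "empty" virtual method

    // True while the sentinel still holds the pattern set at construction.
    bool intact() const { return sentinel_ == kSentinel; }

    // Verify the object before making the virtual call.
    void checkedOnData() {
        assert(intact() && "object memory was overwritten");
        onData();
    }

private:
    static constexpr std::uint32_t kSentinel = 0xdeadbeef;
    std::uint32_t sentinel_ = kSentinel; // set once, never changed afterwards
};
```

A sentinel only detects overwrites that happen to touch it, so it narrows down where the corruption occurs rather than proving the object is healthy.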
user9876
+3  A: 

Your code is bogus: buffer points to some random piece of memory. I'm not sure why the line with bzero is not failing.

The correct code is:

   char buffer[BUFFER_SIZE];

   bzero(buffer, BUFFER_SIZE);
   int readResult = read(socketFD, buffer, BUFFER_SIZE);

or you can use calloc(1, BUFFER_SIZE) to get some memory allocated (and zeroed out).
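A sketch of the calloc variant, in case the buffer has to outlive the current function (makeZeroedBuffer is an invented helper name):

```cpp
#include <cassert>
#include <cstdlib>

#define BUFFER_SIZE 256

// Returns a zero-filled heap buffer of BUFFER_SIZE bytes, or nullptr if
// the allocation fails. The caller owns it and must call std::free().
char* makeZeroedBuffer() {
    return static_cast<char*>(std::calloc(1, BUFFER_SIZE));
}
```

Unlike the stack array, this needs a null check before use and an explicit `std::free(buffer)` when done.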

florin
Yep, exactly right! The pointer was there since the start, and the code in question was added after (hence why stack allocation wasn't used in the first place).
nbolton
Also, bzero was not failing because of the GetSomePointer()->SomeStackMemoryString line (this was similar to what was in my original code, just added it to my question now). This was the crucial line of code that was stopping me from seeing the more than obvious problem.
nbolton
+1  A: 

The issue is that the buffer variable points at memory that was never allocated for it, which causes memory corruption when the read(...) function puts data in buffer.

Normally, bzero would itself cause the segmentation fault, but because buffer had been pointed at the string's internal storage, bzero and read were able to write past the end of that storage (corrupting adjacent memory).

/* this points buffer at the string's own (smaller) storage,
 * so bzero(...) happens not to SIGSEGV */
buffer = (char*)GetSomePointer()->SomeStackMemoryString.c_str();

int writeResult = write(socketFD, buffer, BUFFER_SIZE);

This change fixes the memory corruption:

#define BUFFER_SIZE 256

// Use memory on the stack, for auto allocation and release.
char buffer[BUFFER_SIZE];

// Don't write to the buffer, just pass in the chars on their own.
string writeString = GetSomePointer()->SomeStackMemoryString;
int writeResult = write(socketFD, writeString.c_str(), writeString.length());

// It's now safe to use the buffer, as stack memory is used.
bzero(buffer, BUFFER_SIZE);
int readResult = read(socketFD, buffer, BUFFER_SIZE);
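
For anyone who wants to try the corrected pattern in isolation, here is a self-contained sketch. It is not my real code: roundTrip is a made-up helper, and it uses socketpair() to create a local pair of connected sockets in place of the real socketFD. It assumes a short, non-empty message that fits in the socket's buffer, and it uses memset rather than bzero.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

#define BUFFER_SIZE 256

// Sends `message` over one end of a local socket pair and reads it back
// from the other end. Returns the text received, or "" on any error.
std::string roundTrip(const std::string& message) {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";

    // Write exactly the bytes the string owns, never BUFFER_SIZE of them.
    ssize_t written = write(fds[0], message.c_str(), message.length());

    // Read into storage we own, leaving room for a terminating '\0'.
    char buffer[BUFFER_SIZE];
    std::memset(buffer, 0, BUFFER_SIZE);
    // Skip the read if nothing was sent, otherwise it would block.
    ssize_t got = (written <= 0) ? -1 : read(fds[1], buffer, BUFFER_SIZE - 1);

    close(fds[0]);
    close(fds[1]);
    return (got < 0) ? "" : std::string(buffer, static_cast<size_t>(got));
}
```

A quick check such as `assert(roundTrip("hello") == "hello");` exercises both the write and the read without touching memory the code does not own.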
nbolton