Is it possible to restore the normal execution flow of a C program, after the Segmentation Fault error?

struct A {
    int x;
};
struct A *a = 0;

a->x = 123; // this is where segmentation violation occurs

// after handling the error I want to get back here:
printf("normal execution");
// the rest of my source code....

I want a mechanism similar to NullPointerException that is present in Java, C# etc.

Note: Please don't tell me that C++ has an exception handling mechanism, because I know that, and don't tell me I should check every pointer before assignment, etc.

What I really want to achieve is to get back to the normal execution flow, as in the example above. I know some actions can be taken using POSIX signals. What should that look like? Any other ideas?

+3  A: 

You can catch segmentation faults using a signal handler and decide to continue the execution of the program (at your own risk).

The signal name is SIGSEGV.

You will have to use the sigaction() function, from the signal.h header.

Basically, it works the following way:

struct sigaction sa1;
struct sigaction sa2;

sa1.sa_handler = your_handler_func;
sa1.sa_flags   = 0;
sigemptyset( &sa1.sa_mask );

sigaction( SIGSEGV, &sa1, &sa2 );

Here's the prototype of the handler function:

void your_handler_func( int id );

As you can see, you don't need to return. The program's execution will continue, unless you decide to stop it by yourself from the handler.

Macmade
I know, but what should it look like? Simply: void my_handler() { return; }?
MarcAndreson
See the edit : )
Macmade
Simply returning will not help; the same instruction will get executed again and crash again. A more reliable way to do things (but still ugly and not recommended!) is to `longjmp` (or better yet `siglongjmp`) out of the signal handler to a known-safe location.
R..
@R. I thought that whether the same insn or the next insn gets executed was CPU+OS dependent? At least, it's that way for SIGFPE IIRC.
ninjalj
+1 for R.'s comment
Macmade
@ninjalj, It's waaaaay out there in the realm of undefined behavior, but I was speaking of how it's done in practice.
R..
A: 

Call this, and when a segfault occurs, your code will execute segv_handler and then continue back to where it was.

void segv_handler(int signo)
{
  // Do what you want here
}

signal(SIGSEGV, segv_handler);
Scharron
Can I leave the body empty? Will it return to the place where the error occurred?
MarcAndreson
I just edited to add comments. Yes, it can be empty, and it will return to where the exception occurred.
Scharron
It will continue back where it was, which is the instruction causing the segfault, not *after* the cause of the segfault. So the segfault will occur again, and your handler will be called again, and so on.
nos
oh yes ... Thus, don't segfault and it should be ok ;-)
Scharron
A: 

In POSIX, your process will get sent SIGSEGV when you do that. The default handler just crashes your program. You can add your own handler using the signal() call. You can implement whatever behaviour you like by handling the signal yourself.

Carl Norum
+3  A: 

"All things are permissible, but not all are beneficial" - typically a segfault is game over for a good reason... A better idea than picking up where it was would be to keep your data persisted (database, or at least a file system) and enable it to pick up where it left off that way. This will give you much better data reliability all around.

glowcoder
While this is true in general, there are some special cases where hacks for handling SIGSEGV are arguably worthwhile. One example I can think of is when an extremely tight, performance-critical loop is using numbers pulled from a potentially-untrusted source to be used as write indices and can't afford to do bounds-checking. As long as you can bound the range of potential out-of-bounds writes, you could `mmap` the data and map read-only pages adjacent to it, and let the cpu catch and report out-of-bound writes for you. Yes it's ugly but I know MPlayer's libmpeg2 variant once did this. :-)
R..
I'll certainly concede that point, but you're right, it is ugly! And if there's a bug it will be painful to fix.
glowcoder
A: 

This glibc manual gives you a clear picture of how to write signal handlers.

A signal handler is just a function that you compile together with the rest
of the program. Instead of directly invoking the function, you use signal 
or sigaction to tell the operating system to call it when a signal arrives.
This is known as establishing the handler.

In your case you will have to wait for the SIGSEGV indicating a segmentation fault. The list of other signals can be found here.

Signal handlers are broadly classified into two categories:

  1. You can have the handler function note that the signal arrived by tweaking some global data structures, and then return normally.
  2. You can have the handler function terminate the program or transfer control to a point where it can recover from the situation that caused the signal.

SIGSEGV comes under program error signals.

Praveen S
+4  A: 
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <signal.h>
#include <stdlib.h>
#include <ucontext.h>

void safe_func(void)
{
    puts("Safe now ?");
    exit(0); //can't return to main, it's where the segfault occurred.
}

void
handler (int cause, siginfo_t * info, void *uap)
{
  /* For testing only. Never call stdio functions in a signal handler otherwise. */
  printf ("SIGSEGV raised at address %p\n", info->si_addr);
  ucontext_t *context = uap;
  /*On my particular system, compiled with gcc -O2, the offending instruction
  generated for "*f = 16;" is 6 bytes. Let's try to set the instruction
  pointer to the next instruction (general register 14 is EIP, on Linux x86) */
  context->uc_mcontext.gregs[14] += 6; 
  //alternatively, try to jump to a "safe place"
  //context->uc_mcontext.gregs[14] = (unsigned int)safe_func;
}

int
main (int argc, char *argv[])
{
  struct sigaction sa;
  sa.sa_sigaction = handler;
  int *f = NULL;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO;
  if (sigaction (SIGSEGV, &sa, 0)) {
      perror ("sigaction");
      exit(1);
  }
  //cause a segfault
  *f = 16; 
  puts("Still Alive");
  return 0;
}

$ ./a.out
SIGSEGV raised at address (nil)
Still Alive

I would beat someone with a bat if I saw something like this in production code though; it's an ugly, for-fun hack. You'll have no idea whether the segfault has corrupted some of your data, you'll have no sane way of recovering and knowing that everything is OK now, and there's no portable way of doing this. The only mildly sane thing you could do is try to log an error (use write() directly, not any of the stdio functions, as they're not signal safe) and perhaps restart the program. For those cases you're much better off writing a supervisor process that monitors a child process's exit, logs it and starts a new child process.

nos
+1  A: 

See R.'s comment on Macmade's answer.

Expanding on what he said (after handling SIGSEGV or, for that matter, SIGFPE, the CPU+OS can return you to the offending insn), here is a test I have for division-by-zero handling:

#include <stdio.h>
#include <limits.h>
#include <string.h>
#include <signal.h>
#include <setjmp.h>

static jmp_buf  context;

static void sig_handler(int signo)
{
    /* XXX: don't do this, not reentrant */
    printf("Got SIGFPE\n");

    /* avoid infinite loop */
    longjmp(context, 1);
}

int main()
{
    int a;
    struct sigaction sa;

    memset(&sa, 0, sizeof(struct sigaction));
    sa.sa_handler = sig_handler;
    sa.sa_flags = SA_RESTART;
    sigaction(SIGFPE, &sa, NULL);

    if (setjmp(context)) {
            /* If this one was on setjmp's block,
             * it would need to be volatile, to
             * make sure the compiler reloads it.
             */
            sigset_t ss;

            /* Make sure to unblock SIGFPE, according to POSIX it
             * gets blocked when calling its signal handler.
             * sigsetjmp()/siglongjmp would make this unnecessary.
             */
            sigemptyset(&ss);
            sigaddset(&ss, SIGFPE);
            sigprocmask(SIG_UNBLOCK, &ss, NULL);

            goto skip;
    }

    a = 10 / 0;
skip:
    printf("Exiting\n");

    return 0;
}
ninjalj
`sigemptyset`, `sigprocmask`, etc. are not ISO C either. As soon as you get into fancy signal tricks you're way outside the realm of plain C and into POSIX so you might as well go ahead and use `sigsetjmp` and save yourself the trouble.
R..
+2  A: 

No, it's not possible, in any logical sense, to restore normal execution following a segmentation fault. Your program just tried to dereference a null pointer. How are you going to carry on as normal if something your program expects to be there isn't? It's a programming bug; the only safe thing to do is to exit.

Consider some of the possible causes of a segmentation fault:

  • you forgot to assign a legitimate value to a pointer
  • a pointer has been overwritten possibly because you are accessing heap memory you have freed
  • a bug has corrupted the heap
  • a bug has corrupted the stack
  • a malicious third party is attempting a buffer overflow exploit
  • malloc returned null because you have run out of memory

Only in the first case is there any kind of reasonable expectation that you might be able to carry on.

If you have a pointer that you want to dereference but it might legitimately be null, you must test it before attempting the dereference. I know you don't want me to tell you that, but it's the right answer, so tough.

Edit: here's an example to show why you definitely do not want to carry on with the next instruction after dereferencing a null pointer:

void foobarMyProcess(struct SomeStruct* structPtr)
{
    char* aBuffer = structPtr->aBigBufferWithLotsOfSpace; // if structPtr is NULL, will SIGSEGV
    //
    // if you SIGSEGV and come back to here, at this point aBuffer contains whatever garbage was in memory at the point
    // where the stack frame was created
    //
    strcpy(aBuffer, "Some longish string");  // You've just written the string to some random location in your address space
                                             // good luck with that!

}
JeremyP
+1 for calling the question out on wanting to do the wrong thing. :-)
R..
Thanks for the example, but will this copy operation be performed? I claim the OS will not allow this and will send SIGSEGV instead. What is more, assignment is not the only operation that violates memory; reading memory does not pose such a threat.
MarcAndreson
@MarcAnderson: that is not true. Read operations can also cause SIGSEGV.
Michael Foukarakis
did I say they don't cause SIGSEGV? The threat I mentioned applies to overwriting invalid memory location.
MarcAndreson
@MarcAnderson: the copy operation may or may not be performed. aBuffer will contain a random value after recovering from the first SIGSEGV. If that random value points to protected memory e.g. is 0 or points into a read only segment, it will raise another SIGSEGV. If, however it happens to point into the stack, you'll overwrite a stack frame or two or if it points into the heap, you'll corrupt the heap.
JeremyP
In fact, the worst case scenario is if it points into the middle of some data e.g. the text of a document, in which case it will just alter that data without necessarily flagging any kind of error.
JeremyP
A: 

You can use the SetUnhandledExceptionFilter() function (on Windows), but even to be able to skip the "illegal" instruction you will need to be able to decode some assembler opcodes. And, as glowcoder said, even if it could "comment out" at runtime the instructions that generate segfaults, what would be left of the original program logic (if it may be called that)? Everything is possible, but that doesn't mean it has to be done.

ruslik
+1  A: 

There is no meaningful way to recover from a SIGSEGV unless you know EXACTLY what caused it, and there's no way to do that in standard C. It may be possible (conceivably) in an instrumented environment, like a C-VM (?). The same is true for all program error signals; if you try to block/ignore them, or establish handlers that return normally, your program will probably break horribly when they happen unless perhaps they're generated by raise or kill.

Just do yourself a favour and take error cases into account.

Michael Foukarakis