ansaurus

Question

Handling Exceptions in a critical application that should not crash

Answer 1

+2 A:

You can't handle problems like this with exceptions. You could have a top-level catch block that catches the exception and hope that not too much of the program's state was irrecoverably munched, so that the program can be kept alive. That still doesn't make the user happy: the query she is waiting for still doesn't run.

Ensuring that changes don't destabilize a critical business app requires organization: people who sign off on the changes and verify that they work as intended before they are allowed into production. In other words, QA.

Hans Passant 2010-08-16 15:00:47

Answer 2

A:

If you know an operation can throw an exception, then you need to add exception handling to this area.

Basically, you need to write the code in an exception-safe manner, which usually follows these guidelines:

Do the work that can throw exceptions on temporary values
Commit the changes using the temporary values afterwards (usually this step will not throw an exception)

If an exception is thrown while working on the temporary values, nothing gets corrupted, and the exception handler can manage the situation and recover.
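As a rough illustration of the two guidelines above, here is a minimal sketch. The ParsedLines class and its appendAll() member are hypothetical names made up for this example; the only thing it relies on is that std::vector::swap does not throw.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container of parsed lines (not from the original question).
class ParsedLines {
public:
    // Either every input line is added, or the object is left unchanged.
    void appendAll(const std::vector<std::string>& input) {
        std::vector<std::string> temp = lines_;  // work on a copy (may throw)
        for (const std::string& line : input) {
            if (line.empty())
                throw std::invalid_argument("empty line");  // lines_ untouched
            temp.push_back(line);                // may throw std::bad_alloc
        }
        assert(temp.size() == lines_.size() + input.size());
        lines_.swap(temp);                       // commit: swap does not throw
    }

    std::size_t size() const noexcept { return lines_.size(); }

private:
    std::vector<std::string> lines_;
};
```

If appendAll() throws halfway through the input, the caller still sees the old contents, so a handler further up can report the error and carry on with the next request.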

http://www.gotw.ca/gotw/056.htm

http://www.gotw.ca/gotw/082.htm

David 2010-08-16 15:01:48

Answer 3

+2 A:

Since you talk about parsing different languages, you probably have something like:

class IParser // parser interface
{
public:
  virtual ~IParser() {}
  virtual bool Parse( File& fileToParse, String& errMessage ) = 0;
};

class VBParser : public IParser { /* ... */ };
class SQLParser : public IParser { /* ... */ };

Suppose the Parse() method throws an exception that is not handled: your entire app crashes. Here's a simplified example of how this could be fixed at the application level:

// somewhere in the main server code
void ParseFileForClient( File& fileToParse )
{
  try
  {
    String err;
    if( !currentParser->Parse( fileToParse, err ) )
      ReportErrorToUser( err );
    else
    {
      // process parser result
    }
  }
  catch( std::exception& e )
  {
    ReportErrorToUser( FormatExceptionMessage( e ) );
  }
  catch( ... )
  {
    ReportErrorToUser( "parser X threw unknown exception; parsing aborted" );
  }
}

stijn 2010-08-16 15:17:18

How does one prevent possible corrupted state in your example?

Tony 2010-08-16 15:32:36

One does not. It's just a simple example that allows the Parse method to throw and expects that it does not leave the parser in a corrupt state.

stijn 2010-08-16 18:32:56

Answer 4

A:

It really depends on how long it takes to start up your server application. It may be safer to let the application crash and then restart it. Or, taking a cue from the Chrome browser, run different parts of your application in separate processes that are allowed to crash. If you can safely recover from an exception and trust that your application state is OK, then fine, do it. However, catching std::exception and continuing can be risky.

There are simple and complex ways to babysit processes so that they are restarted if they crash. A couple of tools I use:

bluepill http://asemanfar.com/Bluepill:-a-new-process-monitoring-tool

pacemaker http://www.clusterlabs.org/

bradgonesurfing 2010-08-16 15:28:01

The apps in those links are very interesting, but they're for *nix-based machines. My server runs on Windows machines! If you know of anything that will reliably do the same on a Windows machine, please let me know.

Tony 2010-08-17 07:20:03

Answer 5

A:

For simple exceptions that can happen inside your program due to user errors, simply save the state that can be changed and restore it on failure, like this:

SaveStateThatCanBeAlteredByScript();
try {
    LoadScript();
} catch(std::exception& e){
    RestoreSavedState();
    ReportErrorToUser(e);
}
FreeSavedState();
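A slightly more concrete sketch of the same save-and-restore idea, under some assumptions: ServerState, loadScript() and loadScriptOrRollBack() are hypothetical stand-ins invented for this example, and the state is small enough to copy. Because the snapshot is an ordinary local object, it is released automatically and no explicit free step is needed.

```cpp
#include <cassert>
#include <exception>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for "the state that can be altered by a script".
using ServerState = std::map<std::string, std::string>;

// Hypothetical loader that changes the state before it detects an error.
void loadScript(ServerState& state, const std::string& script) {
    assert(!script.empty());
    state["lastScript"] = script;
    if (script.find("syntax error") != std::string::npos)
        throw std::runtime_error("cannot parse script: " + script);
    state["status"] = "loaded";
}

// Returns an empty string on success, otherwise the error text.
// On failure the state is rolled back to what it was before the call.
std::string loadScriptOrRollBack(ServerState& state, const std::string& script) {
    ServerState saved = state;   // snapshot; if this copy throws, state is intact
    try {
        loadScript(state, script);
        return std::string();
    } catch (const std::exception& e) {
        state.swap(saved);       // restore; swapping maps does not throw
        return e.what();
    }
}
```

The caller can pass a non-empty return value on to whatever reports errors to the user.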

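If you want to prevent external code from crashing your application (possibly untrusted code such as plugins), you need to run it in a separate process and talk to it through an IPC scheme. On Windows, you can memory-map files with CreateFileMapping() and MapViewOfFile(). On POSIX systems you can use shm_open() together with mmap(), with sem_open() for synchronization.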

Mads Elvheim 2010-08-16 16:26:03

Answer 6

A:

If you have a server, you basically have a main loop that waits for a signal to start a job. The signal could be nothing, with your server just going through a list of files on the file system, or it could be more like a web server that waits for a connection and executes the script provided on that connection (or anything like that).

void MainLoop()
{
    while(job = jobList.getJob())
    {
         job.execute();
    }
}

To stop the server from crashing because of the scripts, you need to encapsulate the external jobs in a protected region.

void MainLoop()
{
    // Don't bother to catch exceptions from here.
    // This probably means you have a programming error in the server.
    while(job = jobList.getJob())
    {
        // Catch exception from job.execute()
        // as these exceptions are generally caused by the script.
        try
        {
            job.execute();
        }
        catch(MyServerException const& e)
        {
            // Something went wrong with the server not the script.
            // You need to stop. So let the exception propagate.
            throw;
        }
        catch(std::exception const& e)
        {
            log(job, e.what());
        }
        catch(...)
        {
            log(job, "Unknown exception!");
        }
    }
}

If the server is critical to your operation, then just detecting the problem and logging it is not always enough. A badly written server will crash, so you want to automate the recovery. You should write some form of heartbeat process that checks at regular intervals whether the server process has crashed and, if it has, automatically restarts it.
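The restart half of that idea could look roughly like the sketch below. It is deliberately platform-neutral: superviseWithRestart() and SupervisionResult are hypothetical names, and launchWorker is assumed to start the server process, block until it terminates, and return its exit code (for example a thin wrapper around std::system or a platform process API). It only reacts to the process exiting; detecting a hung process would need a real heartbeat on top.

```cpp
#include <cassert>
#include <functional>

// Outcome of supervising a worker (hypothetical type for this sketch).
struct SupervisionResult {
    int launches;    // how many times the worker was started
    bool cleanExit;  // true if the last run returned exit code 0
};

// Starts the worker, and restarts it after each non-zero exit code until it
// exits cleanly or the restart budget is used up.
SupervisionResult superviseWithRestart(const std::function<int()>& launchWorker,
                                       int maxRestarts) {
    assert(maxRestarts >= 0);
    SupervisionResult result{0, false};
    while (result.launches <= maxRestarts) {
        ++result.launches;
        if (launchWorker() == 0) {
            result.cleanExit = true;  // normal shutdown: stop supervising
            break;
        }
        // Non-zero exit code: treat it as a crash and go round again.
    }
    return result;
}
```

Capping the number of restarts keeps a server that crashes immediately on startup from spinning forever; when the budget runs out, the supervisor should log the failure and alert someone.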

Martin York 2010-08-16 16:56:58
