I am debugging a server application that parses scripts (VBScript, Python, JScript and SQL) for the application that requests it.

This is a very critical application which, if it crashes, causes havoc for a lot of users. The problem I am facing is how to handle exceptions so that the application can continue and the users know if something is wrong in their scripts.

An example: In the SQL scripts the application normally returns a set of values (Date, Number, String and Number). So the scripts have to have a statement at the end as such:

into dtDate, Number, Number, sString. These are values that are built into the application and the server application knows how to interpret these. These fields are treated in the server app as part of an array. The return values should normally be in a specific order as the indexes for these fields into the array are hardcoded inside the server application.

Now when a user writing a script forgets one of these fields, accessing the last field (normally the string) throws an IndexOutofBoundsException.

The question is how does one recover from exceptions of this nature without taking down the application?

Another example is an error in a script for which no parse error message can be generated. These errors just disappear in the background in the application and eventually cause the server app to crash. The scripts on which it fails don't necessarily fail to execute entirely; part of a script doesn't execute while the other parts do, which makes it look fairly odd to a user.

This server app is a native C++ application and uses COM technologies.

I was wondering if anyone has any ideas on the best way to handle exceptions such as the ones described above without crashing the application?

+2  A: 

You can't handle problems like this with exceptions. You could have a top-level catch block that catches the exception and hope that not too much of the program's state got irrecoverably munched, to try to keep the program alive. That still doesn't make the user happy: the query she is waiting for still doesn't run.

Ensuring that changes don't destabilize a critical business app requires organization: people who sign off on the changes and verify that they work as intended before they are allowed into production. In other words, QA.

Hans Passant
A: 

If you know an operation can throw an exception, then you need to add exception handling to this area.

Basically, you need to write the code in an exception-safe manner, which usually follows these guidelines:

  • Work on temporary values that can throw exceptions
  • Commit the changes using the temp values after (usually this will not throw an exception)

If an exception is thrown while working on the temp values, nothing gets corrupted and in the exception handling you can manage the situation and recover.
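As a rough illustration of the two guidelines above, here is a minimal sketch applied to the four return fields from the question. The ScriptResult struct, the parseFields function, and the field order are assumptions made up for this example, not the real server's types. All work that can throw is done on a local temporary, and the caller's object is only changed by the final swap.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the server's per-script result record.
struct ScriptResult {
    std::string date;
    double number1 = 0.0;
    double number2 = 0.0;
    std::string text;
};

// Fills `out` from the raw fields returned by a script.
// Everything that can throw operates on `temp`; `out` is only modified
// by the final swap, which does not throw for these member types.
void parseFields(const std::vector<std::string>& fields, ScriptResult& out) {
    // Check the count up front instead of indexing past the end.
    if (fields.size() != 4) {
        throw std::invalid_argument(
            "expected 4 fields, got " + std::to_string(fields.size()));
    }

    ScriptResult temp;
    temp.date = fields[0];
    // std::stod throws std::invalid_argument if no number can be parsed.
    temp.number1 = std::stod(fields[1]);
    temp.number2 = std::stod(fields[2]);
    temp.text = fields[3];

    std::swap(out, temp);  // commit step
}
```

If parseFields throws, the caller's ScriptResult still holds its previous contents, so the handler can report the error to the user and carry on with the next script.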

http://www.gotw.ca/gotw/056.htm

http://www.gotw.ca/gotw/082.htm

David
+2  A: 

Since you talk about parsing different languages, you probably have something like:

class IParser //parser interface
{
public:
  virtual ~IParser() {}
  virtual bool Parse( File& fileToParse, String& errMessage ) = 0;
};

class VBParser : public IParser { /* ... */ };
class SQLParser : public IParser { /* ... */ };

If the Parse() method throws an exception that is not handled, your entire app crashes. Here's a simplified example of how this could be fixed at the application level:

  //somewhere in the main server code
void ParseFileForClient( File& fileToParse )
{
  try
  {
    String err;
    if( !currentParser->Parse( fileToParse, err ) )
    {
      ReportErrorToUser( err );
    }
    else
    {
      //process parser result
    }
  }
  catch( const std::exception& e )
  {
    //report the caught exception, not the (out of scope) err string
    ReportErrorToUser( FormatExceptionMessage( e ) );
  }
  catch( ... )
  {
    ReportErrorToUser( "parser X threw unknown exception; parsing aborted" );
  }
}
stijn
How does one prevent possible corrupted state in your example?
Tony
One does not... it's just a simple example that allows the Parse method to throw and expects that it does not leave the parser in a corrupt state.
stijn
A: 

It really depends on how long it takes to start up your server application. It may be safer to let the application crash and then reload it. Or, taking a cue from the Chrome browser, run different parts of your application in different processes that are allowed to crash. If you can safely recover from an exception and trust that your application state is OK, then fine, do it. However, catching std::exception and continuing can be risky.

There are simple to complex ways to babysit processes so that they are restarted if they crash. A couple of tools I use:

bluepill http://asemanfar.com/Bluepill:-a-new-process-monitoring-tool

pacemaker http://www.clusterlabs.org/

bradgonesurfing
Those apps in the links are very interesting, however they're for *nix based machines. My server runs on Windows machines! If you know anything that will reliably do the same on a win machine, please let me know?
Tony
A: 

For simple exceptions that can happen inside your program due to user errors, simply save the state that can be changed, and restore it like this:

SaveStateThatCanBeAlteredByScript();
try {
    LoadScript();
} catch(std::exception& e){
    RestoreSavedState();
    ReportErrorToUser(e);
}
FreeSavedState();

If you want to prevent external code from crashing your application (possibly untrusted code like plugins), you need to run it in a separate process and talk to it through an IPC scheme. On Windows, you can memory-map files with CreateFileMapping() and MapViewOfFile(). On POSIX systems you can use sem_open() together with mmap().

Mads Elvheim
A: 

If you have a server, you basically have a main loop that waits for a signal to start up a job. The signal could be nothing, and your server just goes through a list of files on the file system, or it could be more like a web server, where it waits for a connection and executes the script provided on the connection (or anything like that).

void MainLoop()
{
    while(job = jobList.getJob())
    {
         job.execute();
    }
}

To stop the server from crashing because of the scripts, you need to encapsulate the external jobs in a protected region.

void MainLoop()
{
    // Don't bother to catch exceptions from here.
    // This probably means you have a programming error in the server.
    while(job = jobList.getJob())
    {
        // Catch exception from job.execute()
        // as these exceptions are generally caused by the script.
        try
        {
            job.execute();
        }
        catch(MyServerException const& e)
        {
            // Something went wrong with the server not the script.
            // You need to stop. So let the exception propagate.
            throw;
        }
        catch(std::exception const& e)
        {
            log(job, e.what());
        }
        catch(...)
        {
            log(job, "Unknown exception!");
        }
    }
}

If the server is critical to your operation, then just detecting the problem and logging it is not always enough. A badly written server will crash, so you want to automate the recovery. You should write some form of heartbeat process that checks at regular intervals whether the server process has crashed and, if it has, automatically restarts it.
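A minimal sketch of the restart part of that idea follows. It is a simplification: instead of polling at intervals, it blocks until the server exits and restarts it on an abnormal exit code. The names superviseWithRestarts, SuperviseResult, and runServer are hypothetical. On Windows, runServer could plausibly wrap CreateProcess(), WaitForSingleObject(), and GetExitCodeProcess(), but that wiring is left out here and is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Outcome of a supervision run (hypothetical type for this sketch).
struct SuperviseResult {
    bool cleanExit;  // true if the server eventually returned exit code 0
    int restarts;    // number of restarts that were performed
};

// Runs `runServer` (assumed to launch the server, block until it exits,
// and return its exit code). A non-zero exit code is treated as a crash.
// Restarts at most `maxRestarts` times, sleeping `backoff` between tries.
SuperviseResult superviseWithRestarts(const std::function<int()>& runServer,
                                      int maxRestarts,
                                      std::chrono::milliseconds backoff) {
    int restarts = 0;
    while (true) {
        const int exitCode = runServer();
        if (exitCode == 0) {
            return SuperviseResult{true, restarts};   // clean shutdown
        }
        if (restarts >= maxRestarts) {
            return SuperviseResult{false, restarts};  // give up, alert a human
        }
        ++restarts;
        std::this_thread::sleep_for(backoff);  // avoid a tight crash loop
    }
}
```

The restart limit and the backoff are there so that a server which crashes immediately on startup does not spin forever; once the limit is hit, somebody should be notified rather than restarting blindly.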

Martin York