Hi,

I'm currently working on adding exceptions and exception handling to my OSS application. Exceptions have been the general idea from the start, but I wanted to find a good exception framework and in all honesty, understand C++ exception handling conventions and idioms a bit better before starting to use them. I have a lot of experience with C#/.Net, Python and other languages that use exceptions. I'm no stranger to the idea (but far from a master).

In C# and Python, when an unhandled exception occurs, the user gets a nice stack trace and in general a lot of very useful priceless debugging information. If you're working on an OSS application, having users paste that info into issue reports is... well let's just say I'm finding it difficult to live without that. For this C++ project, I get "The application crashed", or from more informed users, "I did X, Y and Z, and then it crashed". But I want that debugging information too!

I've already (and with great difficulty) made my peace with the fact that I'll never see a cross-platform and cross-compiler way of getting a C++ exception stack trace, but I know I can get the function name and other relevant information.

And now I want that for my unhandled exceptions. I'm using boost::exception, and they have this very nice diagnostic_information thingamajig that can print out the (unmangled) function name, file, line and most importantly, other exception specific information the programmer added to that exception.

Naturally, I'll be handling exceptions inside the code whenever I can, but I'm not that naive to think I won't let a couple slip through (unintentionally, of course).

So what I want to do is wrap my main entry point inside a try block with a catch that creates a special dialog that informs the user that an error has occurred in the application, with more detailed information presented when the user clicks "More" or "Debug info" or whatever. This would contain the string from diagnostic_information. I could then instruct the users to paste this information into issue reports.

But a nagging gut feeling is telling me that wrapping everything in a try block is a really bad idea. Is what I'm about to do stupid? If it is (and even if it's not), what's a better way to achieve what I want?

+3  A: 

Wrapping all your code in one try/catch block is a-ok. It won't slow down the execution of anything inside it, for example. In fact, all my programs have (code similar to) this framework:

#include <cstdlib>
#include <exception>
#include <iostream>

int execute(int pArgc, char *pArgv[])
{
    // do stuff

    return EXIT_SUCCESS;
}

int main(int pArgc, char *pArgv[])
{
    // maybe set up some debug stuff,
    // like splitting cerr to log.txt

    try
    {
        return execute(pArgc, pArgv);
    }
    catch (const std::exception& e)
    {
        // anything derived from std::exception carries a message
        std::cerr << "Unhandled exception:\n" << e.what() << std::endl;
        // or other methods of displaying an error

        return EXIT_FAILURE;
    }
    catch (...)
    {
        // anything else that was thrown (int, const char*, ...)
        std::cerr << "Unknown exception!" << std::endl;

        return EXIT_FAILURE;
    }
}
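
If the catch (...) branch ever does fire, it is possible to learn a little more about what was thrown by rethrowing it inside a helper and letting an ordinary chain of catch clauses classify it. This is only a sketch: the helper name describe_current_exception and the list of types it checks are assumptions made for illustration, not something from the framework above.

```cpp
#include <exception>
#include <string>

// Hypothetical helper. Call it ONLY from inside a catch block:
// a bare `throw;` with no exception being handled calls std::terminate.
std::string describe_current_exception()
{
    try
    {
        throw; // rethrow the exception that is currently being handled
    }
    catch (const std::exception& e)
    {
        return std::string("std::exception: ") + e.what();
    }
    catch (const std::string& s)
    {
        return "std::string: " + s;
    }
    catch (const char* s)
    {
        return std::string("C string: ") + s;
    }
    catch (int i)
    {
        return "int: " + std::to_string(i);
    }
    catch (...)
    {
        // Still something we did not anticipate.
        return "unknown type";
    }
}
```

Inside main's catch (...) block, the message could then be printed with std::cerr << describe_current_exception(). The list of types is necessarily a guess, so the final catch (...) is still needed.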
GMan
+1, but is it correct to call something not derived from `std::exception` an exception? ;-) (and no, I don't have a better name for thrown int or something equally sharp-edged).
Michael Krelin - hacker
You mean the `catch (...)`? I only have that for completeness, if anything actually entered it I'd be surprised and hunt down who's throwing random stuff.
GMan
GMan, yes, I see your intention, it's just that I wonder if one says "unknown exception" or "something unknown thrown our way" in this case ;-) a matter of terminology.
Michael Krelin - hacker
Oh, oh, I see, fair point. :] How about "Unknown wtf-is-this-doing-here"? :p
GMan
yeah, something along these lines ;-)
Michael Krelin - hacker
+1  A: 

No it's not stupid. It's a very good idea, and it costs virtually nothing at runtime until you hit an unhandled exception, of course.

Be aware that there is already an exception handler wrapping your thread, provided by the OS (and another one by the C runtime, I think). You may need to pass certain exceptions on to these handlers to get correct behavior. On some architectures, access to misaligned data is handled by an exception handler, so you may want to special-case EXCEPTION_DATATYPE_MISALIGNMENT and let it pass on to the higher-level exception handler.

I include the registers, the app version and build number, the exception type and a stack dump in hex annotated with module names and offsets for hex values that could be addresses to code. Be sure to include the version number and build number/date of your exe.
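
The portable part of such a report (version, build number, exception type and message) can be assembled with the standard library alone. The sketch below is only an illustration under assumptions: BuildInfo and make_crash_report are hypothetical names, and registers or a stack dump are platform-specific and not covered here.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <typeinfo>

// Hypothetical build identification; a real project would probably
// generate these values at build time.
struct BuildInfo
{
    std::string version;
    std::string build;
};

// Builds the text that would be shown in the error dialog.
std::string make_crash_report(const BuildInfo& info, const std::exception& e)
{
    std::ostringstream out;
    out << "Version: " << info.version << '\n'
        << "Build: " << info.build << '\n'
        // typeid(e).name() is implementation-defined and may be mangled.
        << "Exception type: " << typeid(e).name() << '\n'
        << "Message: " << e.what() << '\n';
    return out.str();
}
```

A top-level catch (const std::exception& e) handler could pass e to make_crash_report and put the resulting string into the dialog's edit control.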

You can also use VirtualQuery to turn stack values into "ModuleName+Offset" pretty easily. And that, combined with a .MAP file will often tell you exactly where you crashed.

I found that I could train beta testers to send me the text pretty easily, but in the early days what I got was a picture of the error dialog rather than the text. I think that's because a lot of users don't know you can right-click on any Edit control to get a menu with "Select All" and "Copy". If I were going to do it again, I would add a button that copies the text to the clipboard so that it can easily be pasted into an email.

Even better if you want to go to the trouble of having a 'send error report' button, but just giving users a way to get the text into their own emails gets you most of the way there, and doesn't raise any red flags about "what information am I sharing with them?"

John Knoeller
What's that VirtualQuery you mentioned? Never heard of it, but it could be interesting for our SPARC platform.
It's a Win32 API. I don't know the equivalent for Unix systems, sorry.
John Knoeller
@John: are you TJ?
Hans Passant
Yes. who are you?
John Knoeller
Hans. Small world.
Hans Passant
+8  A: 

Putting a try/catch block in main() is okay, it doesn't cause any problems. The program is dead on an unhandled exception anyway. It isn't going to be helpful at all in your quest to get the all-important stack trace though. That info is gonzo when the catch block traps the exception.
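
The reason the context is gone: when a matching handler exists, the stack is unwound before the handler runs, so every frame between the throw and the catch has already been destroyed. A small sketch that makes the order visible (Frame, event_log and the function names are made up for this illustration):

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Records the order in which things happen.
std::vector<std::string>& event_log()
{
    static std::vector<std::string> log;
    return log;
}

// Stands in for a stack frame: logs when its scope is left.
struct Frame
{
    explicit Frame(std::string name) : name_(std::move(name)) {}
    ~Frame() { event_log().push_back("left " + name_); }
    std::string name_;
};

void inner()
{
    Frame f("inner");
    throw std::runtime_error("boom");
}

void outer()
{
    Frame f("outer");
    inner();
}

// Plays the role of main()'s top-level try/catch.
void run_and_catch()
{
    try
    {
        outer();
    }
    catch (const std::exception& e)
    {
        // By now both Frame objects have been destroyed.
        event_log().push_back(std::string("caught ") + e.what());
    }
}
```

After run_and_catch() returns, event_log() holds "left inner", "left outer" and "caught boom", in that order: the handler only sees the exception object, not the frames that led to it.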

Catching a C++ exception won't be very helpful either. The odds that the program dies on an exception derived from std::exception are pretty slim, although it could happen. Much more likely in a C/C++ app is death due to hardware exceptions, AccessViolation being numero uno. Trapping those requires the __try and __except keywords in your main() function. Again, very little context is available; you've basically only got an exception code. An AV also tells you which exact memory location caused the exception.

This is not just a cross-platform issue btw, you can't get a good stack trace on any platform. There is no reliable way to walk the stack, there are too many optimizations (like framepointer omission) that make this a perilous journey. It is the C/C++ way: make it as fast as possible, leave no clue what happened when it blows up.

What you need to do is debug this kind of problem the C/C++ way. You need to create a minidump. It is roughly analogous to the "core dump" of old, a snapshot of the process image at the time the exception happens. Back then, you actually got a complete dump of the core. There's been progress: nowadays it is "mini", which is somewhat necessary because a complete core dump would take close to 2 gigabytes. It actually works pretty well to diagnose the program state.

On Windows, that starts by calling SetUnhandledExceptionFilter(). You give it a pointer to a callback function that will run when your program dies on an unhandled exception. Any exception, C++ as well as SEH. Your next resource is dbghelp.dll, available in the Debugging Tools for Windows download. It has an entry point called MiniDumpWriteDump(), which creates a minidump.

Once you get the file created by MiniDumpWriteDump(), you're pretty golden. You can load the .dmp file in Visual Studio, almost like it's a project. Press F5 and VS grinds away for a while trying to load .pdb files for the DLLs loaded in the process. You'll want to set up the symbol server; that's very important to get good stack traces. If everything works, you'll get a "debug break" at the exact location where the exception was thrown. With a stack trace.

Things you need to do to make this work smoothly:

  • Use a build server to create the binaries. It needs to push the debugging symbols (.pdb files) to a symbol server so they are readily available when you debug the minidump.
  • Configure the debugger so it can find the debugging symbols for all modules. You can get the debugging symbols for Windows from Microsoft; the symbols for your code need to come from the symbol server mentioned above.
  • Write the code to trap the unhandled exception and create the minidump. I mentioned SetUnhandledExceptionFilter() but the code that creates the minidump should not be in the program that crashed. The odds that it can write the minidump successfully are fairly slim, the state of the program is undetermined. Best thing to do is to run a "guard" process that keeps an eye on a named Mutex. Your exception filter can set the mutex, the guard can create the minidump.
  • Create a way for the minidump to get transferred from the client's machine to yours. We use Amazon's S3 service for that, terabytes at a reasonable rate.
  • Wire the minidump handler into your debug database. We use Jira, it has a web-service that allows us to verify the crash bucket against a database of earlier crashes with the same "signature". When it is unique, or doesn't have enough hits, we ask the crash manager code to upload the minidump to Amazon and create the bug database entry.

Well, that's what I did for the company I work for. Worked out very well, it reduced crash bucket frequency from thousands to dozens. Personal message to the creators of the open source ffdshow component: I hate you with a passion. But you're no longer crashing our app! Buggers.

Hans Passant
That was mean to the ffdshow team, they're doing a very passable job without pay and have lots of grateful users (myself included). Otherwise agree, +1
Anton Tykhyy
Thanks! This kind of "real life" post rarely attracts an up-vote at SO.
Hans Passant
+1 Clear and detailed
John Knoeller
+1 Favorited that question only because of this answer :)
Philip Daubmeier
A: 

In fact, boost::diagnostic_information has been designed specifically to be used in a "global" catch(...) block, to display information about exceptions which should not have reached it. However, note that the string returned by boost::diagnostic_information is NOT user-friendly.

Emil