Hi All,

I have a question about handling errors in a J2EE application. Our current application is used by a large number of users, and as a result we get a lot of support tickets. Most of these tickets are user-related, but 5-10% are system-related: exceptions, unhandled errors, etc.

We have basic exception handling in the code (it needs work), but in my experience, showing a generic message to the user does nothing to speed up troubleshooting.

What I'm looking for is a recommendation for a good error handling design pattern. Consider this scenario:

  1. Code has an error
  2. Error is handled
  3. User is shown a non-technical error message with a specific error code.
  4. The non-technical support team can use this error code to see the area of the application (page, section, etc.) where the error happened and what the user might have been doing (information the dev team has filled in ahead of time in a customer support reference guide).
  5. The tech support team can use the code to zero in on the class, JSP, etc. and the line of code that triggered the exception.
  6. We use a logging module where most (not all) Tomcat stdout errors are logged by user session. The error code the user receives could also include the log ID, if one exists, so that the tech team can look at the log entry too.
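
To make the scenario above concrete, here is a minimal sketch in plain Java. Everything in it is hypothetical and only for illustration: the `ErrorCode` enum, its two entries, the `AppException` class, and the format of the reference shown to the user are assumptions, not part of any existing framework. The idea is that the enum entry carries the information for steps 3 and 4, while the per-incident ID ties the user's message to the log entry for steps 5 and 6.

```java
import java.util.UUID;

/** Hypothetical catalog of support-facing error codes; entries are illustrative only. */
enum ErrorCode {
    ORD_001("Order entry / checkout page", "We could not save your order."),
    RPT_002("Reports / monthly export", "We could not generate your report.");

    final String area;        // what non-technical support looks up in the reference guide
    final String userMessage; // non-technical text shown to the user

    ErrorCode(String area, String userMessage) {
        this.area = area;
        this.userMessage = userMessage;
    }
}

/** Application exception that carries an error code plus a per-incident reference. */
class AppException extends RuntimeException {
    final ErrorCode code;
    final String incidentId;

    AppException(ErrorCode code, Throwable cause) {
        super(code.name(), cause);
        this.code = code;
        // Short random ID so one user's report can be matched to one log entry.
        this.incidentId = UUID.randomUUID().toString().substring(0, 8);
    }

    /** Text for the error page: friendly message plus the reference to quote to support. */
    String userFacingMessage() {
        return code.userMessage + " Please quote reference "
                + code.name() + "-" + incidentId + ".";
    }

    /** One-line summary to log next to the stack trace of the original cause. */
    String logLine() {
        return code.name() + "-" + incidentId + " [" + code.area + "] " + getCause();
    }
}
```

A handler at the top of the request (a servlet filter, for example) would catch `AppException`, log `logLine()` together with the exception itself so the stack trace is recorded, and render `userFacingMessage()` on the error page. The class and line number then come from the logged stack trace rather than from the code itself, which keeps the catalog small.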

Basically, what I'm interested in is shortening the support analysis-explanation-research cycle and putting the right information at every department's fingertips, so that each can get started faster on its own job:

  1. Customer support can give a canned explanation for this error code or have alternative steps the user could follow.
  2. Tech team could start troubleshooting the line of code or what the user was doing which triggered that line.

Essentially, each error code triggers the required next steps for each department, in effect shortening the research phase of the issue and moving on to the solution phase.

I'm not sure if the above is even a good idea. Any suggestions on a good "design pattern" for such a need would be appreciated, as would opinions on whether this is a good route to take at all.

Thanks in advance.

SP

+2  A: 

It sounds like you need to spend some serious time bucketing your support issues for some code triage. My experience has been that you can nearly always create a "top 10 list" of items that are causing 50%+ of the support issues. After you've knocked off the first 10, reexamine the call logs. Data is imperative.

A program dedicated to tackling the support issues and knocking them out of the park should be able to winnow down code issues fairly quickly. After that, you'll be left with human/usability issues that have to be handled with training, experience, or, typically, some longer-term workflow/usability refactoring.

If you're being swamped in errors for which you think you need to develop a list of error codes / conditions, it sounds like your product needs another 6 months of stabilization. You're probably not developing an OS, and for 99% of projects developing an IBMesque black book of error codes is not the model to emulate.

Niniki
A: 

Use log4j to log all errors and exceptions to a logger which emails you the information. Don't worry about user messaging; all they need to know is there was an error and that it was logged.
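
As a rough illustration of the suggestion above, a log4j 1.x `log4j.properties` fragment using the stock `SMTPAppender` might look like the following. The appender name `mail`, the mail host, and both addresses are placeholders, not values from the original post, and the fragment assumes the JavaMail library is on the classpath.

```properties
# Send the root logger's output to an appender named "mail" (name is arbitrary).
log4j.rootLogger=INFO, mail

# SMTPAppender emails buffered log events when an ERROR or worse is logged.
log4j.appender.mail=org.apache.log4j.net.SMTPAppender
log4j.appender.mail.SMTPHost=smtp.example.com
log4j.appender.mail.From=app-errors@example.com
log4j.appender.mail.To=dev-team@example.com
log4j.appender.mail.Subject=Application error
# Number of preceding log events included in each email for context.
log4j.appender.mail.BufferSize=50
log4j.appender.mail.layout=org.apache.log4j.PatternLayout
log4j.appender.mail.layout.ConversionPattern=%d %-5p [%t] %c - %m%n
```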

davetron5000