I am currently working on a rather large multi-tiered app that will be deployed overseas. Although I hope it won't fall over or blow up once deployed, I can't be 100% sure of this. It would therefore be nice to know that I could request the log file to work out exactly what went wrong and why.

So basically, as the title suggests, I would like to know when and what to log. I want comprehensive log files that can be examined easily to determine what happened if my app falls over.

+4  A: 

1 - Make a single log, with a standardized format. It doesn't matter much what the format is, but ensure that every entry has the same basic fields. Just calling "printf" probably won't cut it ( substitute System.err.println or whatever as appropriate )

2 - Allow for at least one field to be an arbitrary string... the developer will know better than you what needs to be there.

3 - Include a high resolution time-stamp on each entry. You will need it eventually, trust me.

4 - If possible, include the file and line number of the origin of the error. That's easy in C, and a bit of a pain in Java. But it's incredibly useful later on, especially when people start to cut+paste code, including the error messages.

5 - Ensure the log is at a place that any level of the code can use it.

6 - I've often used "Primary" and "Secondary" error tags, where "Primary" means "I'm the guy who detected there is a problem", and "Secondary" means "I called a function which reported an error". That makes it easy to find the source of the problem ( "Primary: file not found" ) and still report the meaning of the error ( "Secondary: can't load calibration table" ).

7 - Include some capability to log non-errors as well as errors.
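Several of the points above (one format, a free-form message field, a high-resolution timestamp, file and line number, errors and non-errors alike) can be sketched in a few lines. This is only an illustration: the question doesn't name a language, so I'm assuming Python and its standard logging module here, and the names make_logger, LOG_FORMAT and calib.dat are made up for the example.

```python
import io
import logging

# One shared format: millisecond timestamp, level, source file and line,
# then a free-form message supplied by the developer.
LOG_FORMAT = "%(asctime)s.%(msecs)03d %(levelname)-8s %(filename)s:%(lineno)d %(message)s"
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def make_logger(name, stream):
    """Return a logger that writes every entry to `stream` in LOG_FORMAT."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # let non-errors through as well as errors
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))
    logger.addHandler(handler)
    logger.propagate = False  # keep entries out of the root logger
    return logger


buffer = io.StringIO()  # stands in for the real log file
log = make_logger("app", buffer)
log.info("calibration table loaded")
log.error("file not found: %s", "calib.dat")
print(buffer.getvalue(), end="")
```

Because the format lives in one place, any level of the code can fetch the same logger by name and every entry comes out with the same fields.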

The hardest part, I find, is when an error isn't necessarily an error. If you call a function with a file, and the file doesn't exist, is that an error that should be logged or not? Sometimes it's a critical failure, and sometimes it's expected. It's pretty much up to the API of the function: if the function has a way to return an error, I will usually have it do that without logging. It's then the job of the higher-level code to decide whether it needs to report that error or whether it is expected.
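A rough sketch of that split, again assuming Python's standard logging module. The function names (read_table, load_calibration), the `required` flag and the file name are hypothetical. The low-level helper only returns the error, and the caller decides whether it deserves an entry and at what level.

```python
import io
import logging

buffer = io.StringIO()  # stands in for the real log file
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
log = logging.getLogger("calibration")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False


def read_table(path):
    """Low-level helper: hands any failure back to the caller, logs nothing."""
    try:
        with open(path) as handle:
            return handle.read(), None
    except OSError as exc:
        return None, exc


def load_calibration(path, required):
    """Higher-level code decides whether a missing file is a real error."""
    table, error = read_table(path)
    if error is None:
        return table
    if required:
        log.error("can't load calibration table %s: %s", path, error.strerror)
    else:
        log.info("no calibration table at %s, using defaults", path)
    return None


# The same missing file is expected in one call and an error in the other.
load_calibration("no-such-calibration-table.dat", required=False)
load_calibration("no-such-calibration-table.dat", required=True)
print(buffer.getvalue(), end="")
```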

Chris Arguin
A: 

As long as you don't have to pay much for the performance, logging is important.

In my experience, the most important things to log are the warnings, oopses, sanity-check failures, rainy-day scenarios, and so on that one tends to neglect while coding the sunny-day scenarios, sometimes waving them off with a print "We shouldn't get here". These things tend not to appear during testing but start popping up in deployment, where of course they're not captured.

If you log and intend to read the results remotely, make sure to capture the exact timestamp, the location, and some sort of session ID (in case there are multiple instances running at the same time and writing into the log file). The easier it is for you to determine which messages are part of one execution, the better off you are.

Error levels and types are also important. It is also worth searching the code to make sure you are not writing the same message from multiple locations, or tracing will be difficult.
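One way to get a session ID onto every line, sketched with Python's standard logging module (my assumption, since the question doesn't name a language). The logger name and the eight-character random ID are arbitrary choices for the example.

```python
import io
import logging
import uuid

SESSION_ID = uuid.uuid4().hex[:8]  # one short random ID per run of the app

buffer = io.StringIO()  # stands in for the shared log file
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(session)s] %(module)s:%(lineno)d %(levelname)s %(message)s"))

base = logging.getLogger("session_demo")
base.setLevel(logging.INFO)
base.addHandler(handler)
base.propagate = False

# The adapter attaches the session ID to every record it emits.
log = logging.LoggerAdapter(base, {"session": SESSION_ID})
log.info("startup complete")
log.warning("sanity check failed: queue length %d", -1)
print(buffer.getvalue(), end="")
```

With the ID in a fixed position, pulling one execution out of a shared file is a simple text search.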

Finally, be extremely careful about logging errors if your users run Mac OS X: for some strange reason, even in Leopard, the default logging mechanism gets processed expensively and can hog tons of CPU.

Uri
+5  A: 

First off, grab yourself a logging framework - you haven't mentioned any specific language, but any of the frameworks based around the Apache log4j would be a safe bet. The most important thing is that the framework supports different levels of verbosity (debug messages, warnings, error messages). You can configure the logger at run time as to which messages it will actually write, and to where - there's no point re-inventing the wheel to work with logging.

Implement your logging framework in your source. As a minimum, you should be looking to record and then "add value" to exceptions that can occur in your application. Writing a stack trace to a log file is all well and good, but it's very rarely enough to be able to diagnose the problem - consider logging things like the value of method parameters in a catch {}.
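As an illustration of "adding value" in a catch block, here is a small sketch using Python's standard logging module rather than a log4j-based framework (the idea carries over). The function unit_price and its arguments are invented for the example.

```python
import io
import logging

buffer = io.StringIO()  # stands in for the real log file
log = logging.getLogger("orders")
log.setLevel(logging.ERROR)
log.addHandler(logging.StreamHandler(buffer))
log.propagate = False


def unit_price(total, quantity):
    try:
        return total / quantity
    except ZeroDivisionError:
        # log.exception appends the stack trace; the message itself adds
        # the parameter values that the trace alone would not show.
        log.exception("unit_price failed: total=%r quantity=%r", total, quantity)
        raise


try:
    unit_price(19.99, 0)
except ZeroDivisionError:
    pass  # the caller would handle or report the failure here

print(buffer.getvalue(), end="")
```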

At a higher level, you can utilise the power of the different levels of verbosity to record what is happening in your application. This is especially useful if errors are only occurring on production systems where you can't attach a remote debugger - you can just increase the level of verbosity in the log framework config file, and watch as all your debug("Calling method X with parameter Y") messages appear in the log.
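A minimal sketch of that switch, assuming Python's standard logging module instead of a log4j config file. The helper set_verbosity is hypothetical and stands in for whatever reads the level from your configuration.

```python
import io
import logging

buffer = io.StringIO()  # stands in for the real log file
log = logging.getLogger("verbosity_demo")
log.addHandler(logging.StreamHandler(buffer))
log.propagate = False


def set_verbosity(level_name):
    """Apply a level name such as "ERROR" or "DEBUG" read from configuration."""
    log.setLevel(level_name.upper())


def call_x(y):
    log.debug("Calling method X with parameter %r", y)
    return y * 2


set_verbosity("error")  # shipped default: debug messages are dropped
call_x(1)
set_verbosity("debug")  # raised while chasing a production problem
call_x(2)
print(buffer.getvalue(), end="")
```

Only the second call shows up in the log, and nothing in call_x had to change to make that happen.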

iAn
+1  A: 

I would only like to add one small point. For a large mission-critical application where, once deployed, problems can only be investigated with the help of logs sent by clients, a good sense of when and where to log comes with time as the application matures. Maturity here is directly related to how long the application has been deployed and in use at one place, and to the number of different deployments of it (at different clients/locations).

ayaz
A: 

We develop a large telephony-based system which is used all over the world, and have used our own logging system for applications for years. Levels of debug are very important, and our apps ship with the debug set to "errors only", with log to file enabled on all but the most time-sensitive. We also support diverting our output to the debug trace system (this is Windows, so it's a simple call to OutputDebugString, and our engineers have access to a debug catcher called DBWIN32). This is important, because some classes of bugs require you to be able to see the output from multiple apps, serialised. I have solved some seriously tricky multi-app interaction bugs by the application of this technique. Apps usually add a human-readable tag to the output so we can tell which line came from which app, for this scenario.

The levels we use are typically: Off, Errors only, basic, detailed, "verbose" (where verbose is a placeholder implying multiple things like poll results, user operations, message contents etc - whatever the author thinks is important).

Oh, and the first thing an app writes into its log file is a header giving its version resource, so we can tell what build we're dealing with - don't trust the user or the local engineer to know :-)
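Our own system is Windows-specific, so the following is only a sketch of the header and per-app tag idea using Python's standard logging module. APP_NAME, APP_VERSION and open_log are made-up names, and in a real build the version would come from the build system rather than a constant.

```python
import io
import logging

APP_NAME = "billing"   # hypothetical application tag
APP_VERSION = "1.4.2"  # hypothetical; normally stamped in by the build


def open_log(stream):
    """Write the version header first, then return an errors-only logger."""
    # Written straight to the stream so it appears whatever level is set.
    stream.write(f"=== {APP_NAME} version {APP_VERSION} ===\n")
    logger = logging.getLogger(APP_NAME)
    logger.setLevel(logging.ERROR)  # ship with "errors only"
    handler = logging.StreamHandler(stream)
    # The [name] tag tells lines from different apps apart in a shared trace.
    handler.setFormatter(
        logging.Formatter("%(asctime)s [%(name)s] %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


buffer = io.StringIO()  # stands in for the real log file
log = open_log(buffer)
log.info("polling started")           # dropped at the "errors only" level
log.error("cannot reach database")
print(buffer.getvalue(), end="")
```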

Bob Moore
A: 

AOP is really useful for non-intrusive logging. For example, you can use AOP to log the parameter values and return value of every method call without actually adding the logging statements to each method.

The specific details of how to do this obviously depend on your target language and platform (which you didn't specify). For an example of how to add such a logger to a Java Spring-based application, see here.
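To give a feel for the idea without a full AOP framework, here is a rough Python sketch that uses a plain decorator in place of real aspects. The names logged and add are invented for the example. The function body contains no logging statements, yet every call and return value is recorded.

```python
import functools
import io
import logging

buffer = io.StringIO()  # stands in for the real log file
log = logging.getLogger("trace_demo")
log.setLevel(logging.DEBUG)
log.addHandler(logging.StreamHandler(buffer))
log.propagate = False


def logged(func):
    """Wrap func so that its arguments and return value are logged."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("call %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        log.debug("return %s -> %r", func.__name__, result)
        return result
    return wrapper


@logged
def add(a, b):
    return a + b  # no logging code in the method itself


add(2, b=3)
print(buffer.getvalue(), end="")
```

A real AOP setup would apply the same wrapping from configuration instead of marking each function by hand.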

Don