I recently found a log statement in my projects codebase that says "here i am with search parameter==>===========11/30/2008===1====00:00 AM"
what guidelines do you adhere to for writing good log messages in an application?
I recently found a log statement in my projects codebase that says "here i am with search parameter==>===========11/30/2008===1====00:00 AM"
what guidelines do you adhere to for writing good log messages in an application?
The most useful part in a log statement (except maybe the date & time of course) is a unique ID. All our log entries begin with MSG-xxxxx where xxxxx is a unique integer. This MSG-xxxxx tag is always hard-coded in the source, never in a resource file, so it is easy to find it back. It is also a very easy search key in the documentation.
Also what is useful is to always enclose variable string content in double quotes to easily differentiate from the message itself:
MSG-12345 Tried to open "myfile.txt", returned error code "123 - file not found".
This also helps when you try to extract information from the log using some regular expression.
Do not do String concatenation in logging statements, at least debug statements.
One Java app I was working on a while ago had dismal performance. So we cracked out the profiler to see where the bottlenecks were. Turned out that it was mostly due to String concatenation operations incurred by assembling DEBUG level logging statements that occurred ALL THE TIME inside nested inner loops etc (hah, and to think they got added in the first place to figure out why the performance was so bad!)
don't do this
LOGGER.debug("The variable was " + myVariable + " and we are doing " + foo);
Instead do this
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("put your debug statement here " + foo + " and " + bar);
}
In my logs, I need the date and the time, the process that creates the logs, and the PID. There's nothing worse than looking at logs and wondering whether they came in five minutes ago, or whether they were left behind 5 years ago before your changes. Date and time are important.
When I report errors, I state concisely what was called, what happened, and what error codes came back. If it's errno, I also report back strerror(errno). I'm reading this to find a problem, and usually I'm in a screaming hurry to find the problem. I don't want to look stuff up. I want it to tell me what happened, and I prefer verbose to useless. If I'm debugging, and often even when I'm not, I'll log all the data like the SQL statement or the low-level record, keys, anything that will help me find the problem ASAP.
Sometimes we log insanely, and it can slow the system slightly. But when shit goes down and we get a subpoena and there's a lawsuit on the loose and accusations of baby-killer are being thrown around, we love that we have enough logs on tap to show exactly what happened and why and who caused it. It's Blame Management.
Code variable levels of logging, so you can switch between nothing, basic information on the program flow and insanely detailed information on the inner workings. Then by adjusting the variable that controls logging level you can control how much output you want to receive.
In large projects where you want to be able to receive logging from customers encountering problems it also helps if you can control which parts of your application will log; for example if there is a problems with reading AD user information you can add "AD_LOGGING=EVERYTHING" and receive detailed logginginformation on that without seeing the log information from every other section of the program.
We use "two-dimensional" logging i.e. logging by individual module and within that by level. That gives us real control when trying to debug customer problems.
Each log entry has a unique id plus date plus time plus module plus debug level. The id is reflected (if need be) in the exception / error message displayed to the user. So if the users logs an incident, we can quickly pull up that section of the log and see what happened around that point.
The Debug level messages don't use generalities like "Invalid input", We log the actual values that have caused the problem. If that field has relationships to other fields, we log them as well.
There are 5 aspects which are important for me (starting from less important):
That `log' statement looks more like a trace statement.
Logging: show the normal events and errors
Tracing: show the execution path as well as the all of the logging events
Personally, I want logging to find out what work has been done by my program, and I want trace statements to verify the execution path and for figuring out what went wrong. I'll route tracing statements to a separate file which includes both types of log messages for context.
So you should have 2 log levels at least, and be able to turn off tracing for performance. You should be able to route these event streams to different locations. This allows you to easily keep logs for historical records, while not having them cluttered with debug info that you only want for tracing problems.
A lot of the answers here focus on the fields that you would include in a log message, but I would argue that the placement and logging level of the log calls are more important. If you use log4net you would be able to include/exclude the date stamp at will via config files, but you won't be able to place or remove log statements without recompiling, so it makes sense to think pretty hard about where they go. Beyond the standard fields, such as timestamp and thread ID, etc, you want to know the class and method name that the call was made from. Log4net et al takes care of the class name already if you go by their best practice of naming your logger after the type you are concerned with. Beyond this, I typically include the method name. This is particularly necessary for tracing, but I include it on all my log messages.
Logging:
You want to know enough information about what action is about to be performed to go back and poke around if something goes wrong. Good candidates are message IDs, email addresses, something that uniquely identifies the work item. This sort of message should come as soon as this sort of data is available, so that when you are reading down through a log file, you will see something like 'attempting to do x with y' and then later if we see an exception, we know which work item we need to look at to see why we failed.
Logging exceptions should be paired with an 'attempt' log message, so that the exception message makes sense in context when reading the log. This means thinking about the structure of exception handling. If you are using .net and you only want to log an exception, not handle it, you want to rethrow the same exception, not a new one, so just do 'throw' not 'throw e' where 'e' is your exception instance. Look this one up if it doesn't make sense.
Tracing:
This is actually simpler, usually I'll have a message at the beginning and end of a method of interest, like bookends. In the entry you can print critical method arguments that affect the program flow. Logging a message at the end of a method is kind of optional, usually you will be interested in looking at the trace sort of like a stack trace. You can figure out the execution path without them.
Performance:
For string performance, use the '*Format' methods if you are using log4net or something similar. This internally will use a StringBuilder so that you aren't paying the immutable string penalty all the time. Generally though the key is to be able to turn tracing off for performance, and have logging be terse enough to leave on, even if a log message is expensive. When correctly done, there shouldn't be enough of them to be a problem.