When using a log facility, what are the common "rules of thumb"? E.g.
- Rate limit message X to Y messages per unit of time Z?
- Wait for a recent success message of type T before logging a "new" failure message of the same type?
If you have to discard messages, discard the unimportant ones.
If you're displaying an important message, don't bury it in a flood of unimportant ones.
Make it very cheap to not display a message when that level of messaging is disabled/not needed.
Make it possible to discover the current state of the system without having to read every old message.
Manage the size of your log files (e.g. several files instead of one file of infinite size), and beware of filling the disk.
Consider using a standard output format/medium (for example SNMP, or the NT event log), which you can view and manage using fully-featured 3rd-party tools.
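Two of the points above (cheap disabled messages, bounded log files) can be sketched with Python's standard `logging` module. This is only an illustration under assumptions: the log path, the logger name `rotation_demo`, the tiny size limits, and the helper `expensive_summary` are all made up for the example, and other log facilities will have their own equivalents.

```python
import logging
import logging.handlers
import os
import tempfile

# Hypothetical log location, chosen only for this example.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")

# Rotate at roughly 1 KB and keep 3 old files, so total disk use stays bounded.
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1024, backupCount=3
)
logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)  # DEBUG messages are disabled
logger.addHandler(handler)
logger.propagate = False

calls = 0


def expensive_summary():
    """Stand-in for a costly computation that should run only when needed."""
    global calls
    calls += 1
    return "summary"


# %-style arguments are formatted only if the record is actually emitted.
logger.debug("state: %s", "cheap value")

# Guard costly argument construction with an explicit level check.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("details: %s", expensive_summary())

# Write enough output to force several rotations.
for i in range(200):
    logger.info("message number %d", i)

handler.close()
log_files = sorted(os.listdir(os.path.dirname(log_path)))
```

With DEBUG disabled, `expensive_summary` is never called, and the directory never holds more than the live file plus three backups, no matter how many messages are written.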
Print as much context on failure as you can, including the fullest error message possible. Include the exact location in the program or in the workflow (e.g. "error processing line 10029 of input file" vs. "error processing input file").
When a DB query fails, consider printing the query text nicely formatted (e.g. Sybase errors usually contain only a mangled partial query).
Use a log facility that has nice formatting, including the ability to tag each message as INFO/WARN/ERROR (or whatever its level is), for easy grepping.
Use a log facility that has decent timestamp support.
As you noted, consider volume. Throttle or bundle messages.
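As a rough sketch of the formatting, context, and throttling advice in this answer, here is one way it might look with Python's standard `logging` module. The class `RateLimitFilter`, its parameters, the logger name, and the file name `input.csv` are hypothetical names invented for this example. The throttle is a simple fixed window keyed on the message template, which is only one of several possible policies.

```python
import io
import logging
import time


class RateLimitFilter(logging.Filter):
    """Allow at most `max_per_window` records per message template per window."""

    def __init__(self, max_per_window=3, window_seconds=60.0, clock=time.monotonic):
        super().__init__()
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows = {}  # (level, template) -> (window start, count)

    def filter(self, record):
        # Key on the unformatted template so varying arguments share one budget.
        key = (record.levelno, str(record.msg))
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired, start a new one
        count += 1
        self._windows[key] = (start, count)
        return count <= self.max_per_window


stream = io.StringIO()  # stands in for a real log destination
handler = logging.StreamHandler(stream)
# Timestamp and level on every line make the output easy to grep and sort.
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
)
handler.addFilter(RateLimitFilter(max_per_window=3, window_seconds=60.0))

logger = logging.getLogger("throttle_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Ten failures of the same type in quick succession; each carries its context.
for line_number in range(10027, 10037):
    logger.error(
        "error processing line %d of input file %s", line_number, "input.csv"
    )

emitted = stream.getvalue().splitlines()
```

Only the first three of the ten errors reach the output here. A fuller version might also report how many messages were suppressed once the window closes, which is one way to "bundle" rather than silently drop them.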
I agree with Jonathon: more context would be helpful. Some things to think about are:
These are just a few questions to think about.