tags:

views:

178

answers:

2

We have a message processing system where low latency is critical. Recently, I found that while we sustain a high message rate through our system, we are seeing some "outliers" (messages that take much longer than they should). When we removed logging, the outliers disappeared.

Right now our logging is basically just a wrapped ostream with some logging-level functionality similar to log4j (debug, fatal, etc.).

I was wondering, what do others do to manage logging performance, specifically in message processing activities? How do you manage these I/O bound activities? Do you stripe it out? Do you move to databases instead?

Any advice for optimizing logging is appreciated.

Note: I recognize that there might be other problems with our system that cause the outliers, but for the sake of this question I am only interested in logging optimizations, thanks.

Also: Logging is mandatory for our system.

+9  A: 

I guess it's OS dependent to some extent.

On win32, our logging subsystem simply queues the messages up for a logging thread which handles the disk I/O.

This decouples disk I/O performance from time-critical threads, and gives us good control over exactly how and when the queue gets locked.

Roddy
Hmmm, we are using RHEL 3
windfinder
+1: The same thing popped into my mind as I read the original question. But I think this is also a solution for other OSes!
rstevens
I'm waiting for the answer to my question before voting on this one: if logging is mandatory, then trusting it to another thread to do "some time later, or perhaps never if we crash first", may or may not be acceptable.
Steve Jessop
Also, if max-latency is to be minimised, perhaps the log should be going to a volume which is not used for the main operations. If they're on the same disk platter (or even the same bus if your throughput is high enough), then even with low process priority you might find that a logging operation occasionally delays "real work". I don't know the likelihood of that, since I've never tested it out.
Steve Jessop
+1  A: 

Similar to what Roddy said, we also queue the messages in a thread-safe queue, and have a separate lower priority thread which does the actual disk I/O.

In the background thread, we also limit the number of messages that can be dequeued and written at once; once that limit is reached, we put the background thread back to sleep.

Groo