At work, we're building a distributed application (possibly across several machines on a LAN, possibly later across several continents over a WAN plus VPN). We don't want log files local to each machine, since they fill up the disk and can't be viewed in aggregate, so we need to centralize logging over the network. Most log messages aren't important, so UDP is fine for them, but some are alerts about money-losing events and must be delivered reliably, which implies TCP. We're worried about congesting the network if the logging protocol is too chatty, and about dragging the apps to a crawl if the logging service is slow to respond.
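To make the requirement concrete, here is a rough sketch of the UDP/TCP split I have in mind, using only Python's standard `logging.handlers` module. The logger name, the `BelowLevel` filter, the `build_logger` helper, and the choice of `ERROR` as the "important" threshold are all placeholders for illustration, not part of any existing system.

```python
import logging
import logging.handlers


class BelowLevel(logging.Filter):
    """Pass only records strictly below a given level (hypothetical helper)."""

    def __init__(self, level):
        super().__init__()
        self.level = level

    def filter(self, record):
        return record.levelno < self.level


def build_logger(host, udp_port, tcp_port, alert_level=logging.ERROR):
    """Route routine records over UDP and alerts over TCP.

    host and both ports are assumptions; nothing is sent until a record is logged.
    """
    logger = logging.getLogger("app")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False

    # Routine messages: fire-and-forget datagrams, loss is acceptable.
    udp = logging.handlers.DatagramHandler(host, udp_port)
    udp.addFilter(BelowLevel(alert_level))

    # Alerts: sent over a TCP stream.
    tcp = logging.handlers.SocketHandler(host, tcp_port)
    tcp.setLevel(alert_level)

    logger.handlers = [udp, tcp]
    return logger
```

One caveat with this sketch: `SocketHandler` silently drops a record when the connection is down, so TCP by itself would not give the guaranteed delivery the alerts need. Some form of acknowledgement or local buffering would still be required.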
Some possibilities I've considered are:
- syslog (it seems perfect, but my boss has a strong dislike of it, so I may not be able to choose it).
- Scribe from Facebook (but it seems a bit heavyweight, with a server on every machine; not every log message needs ultra-reliability).
- using a message queue such as RabbitMQ, which can have multiple queues tuned to different levels of transaction safety.
- worst case, I can write my own from scratch.
Do you have other suggestions? What centralized logging solutions have you used, and how well did they work out?
Edit: I was leaning towards Scribe, because its store-and-forward design decouples the running app from network latency. But after struggling to install it, I found that (1) it's not available as a binary package, which nowadays is unforgivable, and (2) it depends intimately on a library (Thrift) that isn't available as a binary package either! Worst of all, it wouldn't even compile properly. That is not release-quality code, even for open source.
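For reference, the decoupling I was after can be approximated inside the process with the standard library's `QueueHandler` and `QueueListener`: the app only pushes records onto an in-memory queue, and a background thread does the slow network send. This is a minimal sketch, not a replacement for Scribe. The `decoupled_logger` helper and the logger name are hypothetical, and the buffer lives in memory only, so records are lost if the process dies.

```python
import logging
import logging.handlers
import queue


def decoupled_logger(slow_handler, name="app.buffered"):
    """Return (logger, listener); logging calls only enqueue and never wait on the network.

    slow_handler is any logging.Handler that may block, e.g. a SocketHandler.
    """
    buf = queue.Queue(-1)  # unbounded in-memory buffer (an assumption; could be capped)

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers = [logging.handlers.QueueHandler(buf)]

    # A background thread drains the queue and calls the slow handler.
    listener = logging.handlers.QueueListener(buf, slow_handler)
    listener.start()
    return logger, listener
```

Calling `listener.stop()` at shutdown flushes whatever is still queued before the thread exits.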