We have developed a PaaS solution for PHP. As part of it, we let developers view their Apache error_log and access_log files through our API.
Currently we write the logs to files on disk, separated per deployment (vhost).
This doesn't scale well as the number of nodes and deployments grows, even though the files live on a distributed filesystem (GlusterFS), so we would like to switch to something better.
For billing and statistics in particular, we would prefer not to parse the log files every time we need those numbers.
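To illustrate what we mean, here is a minimal Python sketch (not our actual code): each access_log line is parsed once, at write time, into a structured record that could then be stored in whichever database we pick. It assumes Apache's Common Log Format; the names `parse_access_line` and `bytes_per_deployment` and the `deployment` field are hypothetical.

```python
import re
from collections import defaultdict

# Apache Common Log Format: host ident user [time] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_access_line(deployment, line):
    """Parse one access_log line into a dict; return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["deployment"] = deployment  # hypothetical field identifying the vhost
    record["status"] = int(record["status"])
    # Apache logs "-" when no response body was sent
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record


def bytes_per_deployment(records):
    """Sum the transferred bytes per deployment from already-parsed records."""
    totals = defaultdict(int)
    for record in records:
        totals[record["deployment"]] += record["bytes"]
    return dict(totals)
```

With records stored in this shape, a billing figure such as traffic per deployment becomes a query or aggregation over fields rather than another pass over raw text.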
MongoDB's capped collections look great for logging, so we wanted to go with that. But it turns out they don't seem to work with auto-sharding, which rather defeats the purpose for us, since we expect far more writes than reads.
The other option is Cassandra, which I like for its "every node is equal" approach, but it has nothing like capped collections.
As far as I can tell, neither of the two offers a distinctive feature that would settle the decision for me, or else I'm not seeing it.
So what I'd like to know is: has anybody used either of these systems for logging before? What was your experience, and can you give me some tips? Or are there other solutions that would fit our needs better?