Hi,

I need to store logs in a distributed file system.

Let's say that I have many types of logs. Each log type is recorded in its own file, but that file can be huge, so it must be distributed across many nodes (with replication for data durability).

These files must support append/get operations.
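As a rough illustration of the append/get requirement, here is a minimal in-memory sketch. All names (`Node`, `DistributedLog`, `chunk_size`, `replicas`) are hypothetical and do not come from any real system. It assumes fixed-size chunks placed round-robin across nodes; a real system would also need persistence, failure detection, and re-replication.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One storage node; chunk replicas live in memory (stand-in for disk)."""
    name: str
    chunks: Dict[int, bytearray] = field(default_factory=dict)


class DistributedLog:
    """Toy append-only log split into fixed-size chunks.

    Each chunk is stored on `replicas` distinct nodes, chosen round-robin.
    This is an illustrative assumption, not how any particular product works.
    """

    def __init__(self, nodes: List[Node], chunk_size: int = 64, replicas: int = 2):
        if replicas > len(nodes):
            raise ValueError("need at least as many nodes as replicas")
        self.nodes = nodes
        self.chunk_size = chunk_size
        self.replicas = replicas
        self.length = 0  # total bytes appended so far

    def _owners(self, chunk_id: int) -> List[Node]:
        # Round-robin placement: chunk i goes to nodes i, i+1, ... (mod n).
        n = len(self.nodes)
        return [self.nodes[(chunk_id + i) % n] for i in range(self.replicas)]

    def append(self, data: bytes) -> int:
        """Append data to the log and return the offset where it starts."""
        start = self.length
        pos = 0
        while pos < len(data):
            chunk_id, used = divmod(self.length, self.chunk_size)
            take = min(self.chunk_size - used, len(data) - pos)
            piece = data[pos:pos + take]
            # Write the same bytes to every replica of this chunk.
            for node in self._owners(chunk_id):
                node.chunks.setdefault(chunk_id, bytearray()).extend(piece)
            self.length += take
            pos += take
        return start

    def get(self, offset: int, size: int) -> bytes:
        """Read up to `size` bytes starting at `offset`."""
        end = min(offset + size, self.length)
        out = bytearray()
        while offset < end:
            chunk_id, start = divmod(offset, self.chunk_size)
            take = min(self.chunk_size - start, end - offset)
            # Use the first replica that still holds the chunk.
            for node in self._owners(chunk_id):
                chunk = node.chunks.get(chunk_id)
                if chunk is not None:
                    out += chunk[start:start + take]
                    break
            else:
                raise IOError(f"all replicas of chunk {chunk_id} are missing")
            offset += take
        return bytes(out)
```

With three nodes and `replicas=2`, a record appended with `append` can still be read back with `get` after any single node loses its data, because every chunk has a second copy on a different node.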

Is there a distributed system that meets these needs?

Thanks!

A: 

Combine a NAS with a NoSQL database like MongoDB and you'll have storage that is distributed, large, and fault tolerant.

Of course, without more specific details, such as how much data there is and how structured the logs are (or aren't), it's really hard to recommend a specific product.

For example, if by "huge" you really mean 2 TB or less, and the data is highly structured, then a regular SQL server in a two-machine cluster set up for failover will do just fine.

However, if by "huge" you mean exabyte level or more, and/or the data is unstructured, then you need several large (and very expensive) NAS devices, on which you run a set of NoSQL databases clustered for failover and/or multi-master replication.

Chris Lively
A: 

I would recommend Flume, a log collection infrastructure from the folks at Cloudera:

http://github.com/cloudera/flume

You can also try out Scribe from Facebook:

http://github.com/facebook/scribe

Spike Gronim