I have a system that receives log files from many sources over HTTP (>10k producers, 10 logs per day each, ~100 lines of text per log).
I would like to store them so that I can compute miscellaneous statistics over them nightly, export them (ordered by date of arrival or by the content of the first line), and so on.
My question is: what's the best way to store them?
- Flat text files (with proper locking), one file per uploaded file, one directory per day/producer
- Flat text files, one (big) file per day for all producers (the problems here will be indexing and locking)
- A database table with the text as a blob (MySQL is preferred for internal reasons; the problem is purging, since DELETE can take a very long time!)
- A database table with one record per line of text
- A database with one table per day, allowing simple data purges. (This is effectively partitioning, but the MySQL version I have access to (i.e. the one supported internally) does not support it.)
- A document-oriented DB à la CouchDB or MongoDB (possible problems with indexing, maturity, and ingestion speed)
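To make the table-per-day option concrete, here is a minimal sketch of what I have in mind. It uses Python with an in-memory SQLite database as a stand-in for MySQL, and the table/column names (`logs_YYYYMMDD`, `producer`, `line`) are just placeholders; the point is that purging a day becomes a cheap `DROP TABLE` instead of a slow row-by-row `DELETE`.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")  # stand-in for the real MySQL server

def table_for(day):
    # One table per day, e.g. logs_20240102
    return "logs_" + day.strftime("%Y%m%d")

def ingest(day, producer, line):
    # Create the day's table lazily on first write, then append the line
    t = table_for(day)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {t} (producer TEXT, line TEXT)"
    )
    conn.execute(f"INSERT INTO {t} VALUES (?, ?)", (producer, line))

def purge(day):
    # Purging a whole day is a single DROP TABLE, not a long DELETE
    conn.execute(f"DROP TABLE IF EXISTS {table_for(day)}")

today = date(2024, 1, 2)
ingest(today, "producer-1", "GET /index 200")
ingest(today - timedelta(days=1), "producer-2", "old entry")
purge(today - timedelta(days=1))  # yesterday's data gone instantly
```

The nightly statistics job would then only scan the one or two day-tables it needs, instead of filtering a single huge table by date.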
Any advice?