views:

128

answers:

1

I have a large set of log files that I want to characterize or possibly add some kind of decision tree or some kind of analytics. But I don't know exactly what. What kind of analysis have you done with log files, a lot of log files.

For example, so far I am collecting how many requests are made to a particular page for a given log file.

Servlet = 60 requets Servlet2 = 70 requests, etc.

I guess right there, filter by only the most popular requests. Also, might do something like 60 requests given a 2 hour period. 60 / 160 minutes.

+1  A: 

Deciding what analysis to do depends on what decisions you're trying to make based on that analysis. For example, I currently monitor logs for exceptions reported by our application (all exceptions in the client application are logged with the server) to decide what should be high priority client bugs to investigate. I also use log searching software to monitor for any Exceptions reported by our server software which may need more immediate investigation. On top of the logs generated by everything anyway, I also use some monitoring software to track usage of our web server and database server which records usage stats etc. in a database. The final aim of this is to predict future usage levels and purchase more hardware as appropriate to keep up with demand.

Two (free) tools I've been using are:

Hyperic for monitoring, it's pretty easy to set up and might be able to start logging a lot of data you may be interested in, ie requests per second on a web server.

Splunk for searching log files, it's very easy to get set up and work with and gives you excellent searching capabilities over your log files. If you're working with log files right now and haven't tried out splunk I definitely recommend it. I have noticed a couple of moments of 100% cpu whilst using it on our main production server so stopped running it on that machine recently, just a word of warning.

Not sure what your aim is with this analysis, mine has been very much about looking for any errors I should know about, and planning for future capacity needs. If you're interested in the latter I'd also recommend The Art of Capacity Planning.

Robin