What are the popular tools people use on Unix to parse and analyze log files? I need to do things like counting, finding unique values, and selecting or copying lines that match certain patterns. Please suggest some tools or keywords. I believe similar questions must have been asked before, but I don't know which keywords to search for. Thanks.
Take a look at some of the generic log parsers listed here. If you use something like syslog, you can probably get a custom parser/analyzer too. Otherwise, for trivial searches, any scripting language like perl, python or even awk suffices.
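As a rough illustration of the "trivial searches" case, here is a minimal sketch of the three tasks from the question (select, count, unique) using grep, awk and sort. The sample lines are made up, and the assumption that the program name is the fifth whitespace-separated field only holds for this syslog-like layout; adjust it to your own format.

```shell
#!/bin/sh
# Made-up syslog-style sample lines, used only for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
Jan  1 00:00:01 host1 sshd[101]: Failed password for root from 192.0.2.1
Jan  1 00:00:02 host1 sshd[102]: Accepted password for alice from 192.0.2.9
Jan  1 00:00:03 host1 cron[200]: job started
Jan  1 00:00:04 host1 sshd[103]: Failed password for bob from 192.0.2.1
EOF

# Select: keep only the lines that match a pattern.
failed=$(grep 'Failed password' "$log")

# Count: number of lines that match the pattern.
failed_count=$(grep -c 'Failed password' "$log")

# Unique: distinct program names. Field 5 looks like "sshd[101]:",
# so everything from the first "[" onward is stripped (assumed layout).
programs=$(awk '{ sub(/\[.*/, "", $5); print $5 }' "$log" | sort -u)

printf 'matching lines:\n%s\n' "$failed"
printf 'count: %s\n' "$failed_count"
printf 'programs:\n%s\n' "$programs"

rm -f "$log"
```

Useful keywords to search for are grep, awk, sort and uniq.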
For regular, nightly checking there is logwatch, which has several different scripts in /usr/share/logwatch/scripts/services that check for specific things (web server, FTP server, sshd and so on) in syslog. The default install enables most of them, but you can enable or disable them as you like, or even write your own scripts.
For real-time watching there is multitail.
I find it to be a huge failure that many log formats do not separate columns with proper unique field separators. This is not because a unique separator is the best design in itself, but because the basic premise of the Unix textutils is that they operate on tabular data. Instead, log formats tend to use spaces as separators and quote the fields that might contain spaces.
One of the most practical simple changes I made to web log analysis was to abandon the default NCSA log format produced by the nginx web server and use tab as the field separator instead.
Suddenly I could use all of the primitive unix textutils for quick lookups, but especially awk! Print only lines where the user-agent field contains Googlebot:
awk 'BEGIN {FS="\t"} $7 ~ /Googlebot/ { print; }' < logfile
Find the number of requests for each unique request:
awk 'BEGIN {FS="\t"} { print $4; }' < logfile | sort | uniq -c | sort -n
And of course lots of combinations to find specific visitors.
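One such combination, as a sketch only: count Googlebot requests per client address. The answer above implies only that field 4 is the request and field 7 is the user agent. The remaining field positions, in particular the client address in field 1, and the sample data are assumptions made for this example.

```shell
#!/bin/sh
# Tiny tab-separated sample log with made-up data. Hypothetical layout:
# 1=client address, 2=user, 3=time, 4=request, 5=status, 6=referer, 7=user-agent.
logfile=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  '192.0.2.1'    '-' '2024-01-01T00:00:01' 'GET / HTTP/1.1'      '200' '-' 'Googlebot/2.1' \
  '198.51.100.7' '-' '2024-01-01T00:00:02' 'GET /a HTTP/1.1'     '200' '-' 'Googlebot/2.1' \
  '203.0.113.5'  '-' '2024-01-01T00:00:03' 'GET / HTTP/1.1'      '200' '-' 'Mozilla/5.0' \
  '192.0.2.1'    '-' '2024-01-01T00:00:04' 'GET /robots.txt HTTP/1.1' '200' '-' 'Googlebot/2.1' \
  > "$logfile"

# Keep Googlebot lines, print the client address, then count per address,
# with the busiest address first.
result=$(awk 'BEGIN {FS="\t"} $7 ~ /Googlebot/ { print $1 }' < "$logfile" \
  | sort | uniq -c | sort -rn)

printf '%s\n' "$result"
rm -f "$logfile"
```

Because the separator is a tab, the request field can contain spaces without confusing awk, which is the point of the format change.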
Any programming language that lets you open and read files and do string/text manipulation can be used, e.g. Perl, Python, (g)awk, Ruby, PHP, or even Java. They also have modules for the file formats you are parsing, e.g. CSV.