ansaurus

Question

Tool or language to count occurrences of errors in a log file

Answer 1

+4 A:

Perl would be my first choice for the string parsing. With a regex you could parse that log file in no time; from what I can see, you're dealing with a nicely machine-readable file. You could use a Perl hash to do your averaging.

You could likely do the same thing in C# with its regex classes if you're more familiar with that, but Perl was built for this kind of task.

Joe Basirico 2009-01-23 18:49:45

How can I group and count all of the same errors? I understand that the regex will give me a count of all matching items, but I need them grouped without knowing what the full error text may be. I can match on "ERROR |", but that is too broad, and matching on a specific error may miss a new one.

Tai Squared 2009-01-23 19:02:06

Answer 2

+1 A:

I would use a regex and count the number of occurrences. You can do this in a myriad of languages; even a simple shell script would do it, e.g.

grep "ERROR" logfile | wc -l

Spikolynn 2009-01-23 18:58:40

This would only give a count of all errors, not a count of each error by type which is what I need.

Tai Squared 2009-01-23 19:04:20

Missed that, sorry. For that I would write a C# program to open the file, get all matches of the regex ".*ERROR.*\n", split each match on "ERROR" with regex.Split("ERROR"), and use a Hashtable to count the occurrences of each error.
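As a rough illustration of the match, split and count idea, here is a minimal Python sketch rather than the C# program described. The sample lines and the "timestamp | thread | type | message" layout are assumptions made up for the example, and count_errors is a hypothetical helper name.

```python
import re
from collections import Counter

# Hypothetical sample lines; the real log layout is an assumption.
SAMPLE_LOG = """\
2009-01-23 10:00:01 | T1 | INFO | Foo.Bar: Login Transaction [584] executed in [12] milliseconds
2009-01-23 10:00:02 | T2 | ERROR | Foo.Bar: Backend error
2009-01-23 10:00:03 | T1 | ERROR | Foo.Com: Timeout error
2009-01-23 10:00:04 | T3 | ERROR | Foo.Bar: Backend error
"""

# Capture whatever follows "ERROR |" up to the end of the line.
ERROR_LINE = re.compile(r"ERROR\s*\|\s*(.*)$")


def count_errors(lines):
    """Return a Counter mapping each distinct error text to its frequency."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts


if __name__ == "__main__":
    # Most frequent error first.
    for text, n in count_errors(SAMPLE_LOG.splitlines()).most_common():
        print(n, text)
```

Because the dictionary key is whatever text follows the marker, new error messages are picked up without being listed in advance.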

Spikolynn 2009-01-23 19:13:33

Answer 3

+1 A:

If you know/like .NET, the Push LINQ framework that Marc Gravell and I developed would be an ideal candidate for this. Basically, you set up all the aggregation you want (grouping, summing etc) beforehand, and "push" the logfile through it, then ask for the results at the end. This would let you do everything with near-constant memory consumption and a single pass through the data.

Let me know if you want more details.
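The single-pass idea can be sketched in plain Python. This is only an analogue of the approach, not the Push LINQ API; the field layout, the sample data and the names parse and aggregate are assumptions for illustration.

```python
from collections import Counter
import io


def parse(lines):
    """Lazily yield (type, message) pairs, one input line at a time."""
    for line in lines:
        fields = [f.strip() for f in line.split("|", 3)]
        if len(fields) == 4:  # skip lines that do not have four fields
            yield fields[2], fields[3]


def aggregate(lines):
    """Feed every record to all aggregators in one pass over the input."""
    by_type = Counter()   # how many lines of each type (INFO, ERROR, ...)
    by_error = Counter()  # how many times each error message occurs
    for kind, message in parse(lines):
        by_type[kind] += 1
        if kind == "ERROR":
            by_error[message] += 1
    return by_type, by_error


if __name__ == "__main__":
    # Hypothetical sample data; an open file object could be passed instead,
    # since Python also iterates over files line by line.
    sample = io.StringIO(
        "10:00:01 | T1 | INFO | Foo.Bar: started\n"
        "10:00:02 | T2 | ERROR | Foo.Bar: Backend error\n"
        "10:00:03 | T1 | ERROR | Foo.Bar: Backend error\n"
    )
    by_type, by_error = aggregate(sample)
    print(dict(by_type))
    print(dict(by_error))
```

Memory use in this sketch grows with the number of distinct keys, not with the size of the log, because no line is kept after it has been counted.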

Jon Skeet 2009-01-23 18:59:07

Answer 4

+1 A:

Here's a Unix (or Cygwin) command-line way to do this with:

An AWK command (to parse out the 4th field, where your fields are separated by pipes "|")
A SED command to replace the transaction # ([584]) above with [id], to make grouping easier
sort and uniq to find and count duplicate lines

Here is the command line:

awk -F'|' '{print $4}' logfile.txt | sed -e 's/\[[0-9]*\]/[id]/g' \
| sort | uniq -c | sort -n

Here's the output:

   1  Biz.Dee: Logout Transaction [id] executed in [id] milliseconds
   1  Foo.Bar: Backend error
   1  Foo.Bar: InvalidUserException
   1  Foo.Com: Timeout error
   3  Foo.Bar: Login Transaction [id] executed in [id] milliseconds
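For readers without the Unix tools at hand, the same steps (take the 4th field, mask bracketed numbers, count duplicates, sort by count) might look like this in Python. The sample lines and the name normalized_counts are hypothetical; this is a sketch of the technique, not a drop-in replacement for the pipeline.

```python
import re
from collections import Counter

# Same pattern as the sed expression: a bracketed run of digits.
BRACKETED_NUMBER = re.compile(r"\[[0-9]*\]")


def normalized_counts(lines):
    """Count 4th-field messages after masking bracketed numbers as [id]."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) >= 4:
            message = BRACKETED_NUMBER.sub("[id]", fields[3]).strip()
            counts[message] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, made up for illustration.
    sample = [
        "10:00:01 | T1 | INFO | Foo.Bar: Login Transaction [584] executed in [12] milliseconds",
        "10:00:02 | T2 | INFO | Foo.Bar: Login Transaction [585] executed in [7] milliseconds",
        "10:00:03 | T2 | ERROR | Foo.Com: Timeout error",
    ]
    # Least frequent first, similar to sorting the `uniq -c` output.
    ordered = sorted(normalized_counts(sample).items(), key=lambda kv: (kv[1], kv[0]))
    for message, n in ordered:
        print(f"{n:4d}  {message}")
```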

NicJ 2009-01-23 19:02:01

Answer 5

+2 A:

Here's a possible Perl starting point for you:

#! /usr/bin/perl
use strict;
use warnings;

my %unique_messages;
while (<>)
{
  my ($timestamp, $thread, $type, $message) = $_ =~
    /^
      ([^|]+) \|
      ([^|]+) \|
      ([^|]+) \|
      (.+)
     $/x;

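  # Skip lines that do not match the four-field format.
  next unless defined $type;

  # Count each distinct message that appears on an ERROR line.
  $unique_messages{$message}++ if $type =~ /ERROR/;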
}

print $unique_messages{$_}, ' -> ', $_, "\n" for keys %unique_messages;
exit 0;

Produces:

% ec.pl < err.log
1 ->  Foo.Com: Timeout error
1 ->  Foo.Bar: InvalidUserException
2 ->  Foo.Bar: Backend error

Paul Beckingham 2009-01-23 19:04:42

Answer 6

A:

Another possibility using awk:

grep ERROR filename.log | awk -F'|' '{ print $4 }' | awk -F':' '{count[$1]++} END {for (j in count) print j ": " count[j] " occurrence(s)"}'
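Note that this groups by the text before the first colon of the 4th field (for example Foo.Bar), not by the full message. A minimal Python sketch of the same grouping, assuming the pipe-separated layout and using the hypothetical name errors_by_component:

```python
from collections import Counter


def errors_by_component(lines):
    """Count ERROR lines per component, i.e. the 4th-field text before ':'."""
    counts = Counter()
    for line in lines:
        if "ERROR" not in line:  # same filter as `grep ERROR`
            continue
        fields = line.split("|")
        if len(fields) < 4:
            continue
        component = fields[3].split(":", 1)[0].strip()
        counts[component] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, made up for illustration.
    sample = [
        "10:00:01 | T1 | ERROR | Foo.Bar: Backend error",
        "10:00:02 | T2 | ERROR | Foo.Bar: InvalidUserException",
        "10:00:03 | T1 | ERROR | Foo.Com: Timeout error",
        "10:00:04 | T3 | INFO | Foo.Bar: Login Transaction [584] executed in [3] milliseconds",
    ]
    for component, n in errors_by_component(sample).items():
        print(f"{component}: {n} occurrence(s)")
```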

armandino 2009-01-23 19:26:50

Answer 7

A:

You can use a program like Monarch to give structure to flat data. I have used it to turn text files into tables that I can use in a database.

Jas Panesar 2009-01-23 19:28:13

Answer 8

A:

Microsoft Log Parser, if you are OK with SQL and are using Windows. It is free and quite handy, and easy to wrap in an HTA, so you can use VBS or (?) JS to build query strings interactively. I believe it will do subtotals for you; it certainly sorts and groups.

2009-02-15 05:05:40

Answer 9

A:

In Vim you can run :%s/pattern//n, where pattern is the search string; the n flag reports the number of matches without changing the file.

Tim Matthews 2009-02-15 05:11:06
