I need to extract some information from a log file using a shell script (bash). A line from the log file usually looks like this:

2009-10-02 15:41:13,796| some information

Occasionally, such a line is followed by a few more lines giving details about the event. These additional lines do not have a specific format (in particular they don't start with a timestamp).

I know how to use grep to filter the file based on keywords and expressions. Basically what I'm having trouble with is that sometimes I need to look at specific intervals only. For example, I need to look only at the events that happened during the last X minutes. I'm not experienced with shell scripting, and because of the complexity of the time format this seems like a rather difficult task to me. On the other hand, I imagine this is not an unusual requirement, so I'm wondering whether there are tools that make this easier, or whether you can give me some hints on how to tackle the problem.

+1  A: 
gawk -F"[-: ,]" 'BEGIN{
  fivemin = 60 * 5   # last 5 minutes, in seconds
  now=systime()
  difference=now - fivemin
}
/^20/{
  # with "," in the field separator, $6 is the whole seconds ("13" from "13,796|")
  yr=$1
  mth=$2
  day=$3
  hr=$4
  min=$5
  sec=$6
  t1=mktime(yr" "mth" "day" "hr" "min" "sec)
  if ( t1 >= difference) {
   print
  }
}' file
ghostdog74
@ghostdog74: This doesn't seem to handle the extra lines of information that the OP has in his log file. I like it, though.
Marc Reside
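One possible way to carry the untimestamped detail lines along, as the comment above asks for, is to remember whether the most recent timestamped line fell inside the window and to print the following lines on that basis. The sketch below is not ghostdog74's script: it swaps mktime() for a plain string comparison, which works only because the YYYY-MM-DD HH:MM:SS format sorts chronologically as text. The function name filter_since and its arguments are made up for illustration, and the commented usage line assumes GNU date.

```shell
#!/bin/bash
# Hypothetical helper: print every event at or after a cutoff timestamp,
# together with the untimestamped detail lines that follow it.
# $1 = cutoff in "YYYY-MM-DD HH:MM:SS" form, $2 = log file
filter_since() {
  awk -v cutoff="$1" '
    # A line that starts with a timestamp opens a new event:
    # decide once whether this event is recent enough to keep.
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
      keep = (substr($0, 1, 19) >= cutoff)
    }
    # Detail lines inherit the decision made for the preceding event.
    keep
  ' "$2"
}

# Usage sketch (GNU date assumed for -d):
# filter_since "$(date -d '5 minutes ago' +'%Y-%m-%d %H:%M:%S')" file
```

Because the comparison is purely textual, this only needs a POSIX awk, but it assumes every timestamp in the file uses the same zero-padded format and time zone as the cutoff.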
A: 

You might want to take a look at my Python program which extracts data from log files based on a range of times. The specification of dates is not yet implemented (it is designed to look at roughly the most recent 24 hours). The time format that it expects (e.g. Jan 14 04:10:13) looks a little different than what you want, but that could be adapted. I haven't tested it with non-timestamped lines, but it should print everything within the specified range of times.

This will give you some usage information:

timegrep.py --help 
Dennis Williamson
+1  A: 

Basically what I'm having trouble with is that sometimes I need to look at specific intervals only.

You could use date to convert the timestamp into seconds since the epoch with the %s format specifier:

%s     seconds since 1970-01-01 00:00:00 UTC

With it we can make a small demonstration:

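#!/bin/bash

timespan_seconds=300 # 5 minutes

# GNU date: -d parses the given timestamp, +%s prints it as epoch seconds
time_specified=$(date +"%s" -d "2010-08-25 14:54:40")

time_now=$(date +"%s")
# earliest timestamp that still counts as "within range"
time_diff=$(( time_now - timespan_seconds ))

if [ "$time_specified" -ge "$time_diff" ]; then
        echo "Time is within range"
fi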

Note that this doesn't handle timestamps in the future: they also pass the check.
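As a rough sketch of how the same date +%s comparison could be applied to the log format from the question, the functions below add an upper bound (so future timestamps are rejected) and loop over the timestamped lines of a file. The names in_window and recent_lines are invented for this example, GNU date is assumed for the -d option, and the first 19 characters of a line are assumed to hold the timestamp without the milliseconds.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a timestamp lies within the last $2 seconds
# and is not in the future. Assumes GNU date (for the -d option).
# $1 = timestamp such as "2010-08-25 14:54:40", $2 = window in seconds,
# $3 = optional "now" in epoch seconds (defaults to the current time)
in_window() {
  local ts now start
  ts=$(date -d "$1" +%s) || return 2   # timestamp could not be parsed
  now=${3:-$(date +%s)}
  start=$(( now - $2 ))
  [ "$ts" -ge "$start" ] && [ "$ts" -le "$now" ]
}

# Hypothetical helper: print the timestamped lines of a log file that fall
# within the last $2 seconds. $1 = log file, $2 = window in seconds
recent_lines() {
  local line
  local re='^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'
  while IFS= read -r line; do
    # Skip detail lines that do not begin with a timestamp.
    [[ $line =~ $re ]] || continue
    # Characters 1-19 are "YYYY-MM-DD HH:MM:SS"; the ",796| ..." part is dropped.
    if in_window "${line:0:19}" "$2"; then
      printf '%s\n' "$line"
    fi
  done < "$1"
}
```

Calling recent_lines file 300 would then print the events of the last five minutes. This starts one date process per timestamped line, so it is likely to be slow on large logs.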

gamen