I have a directory full of log files in the form

${name}.log.${year}${month}${day}

such that they look like this:

logs/
  production.log.20100314
  production.log.20100321
  production.log.20100328
  production.log.20100403
  production.log.20100410
  ...
  production.log.20100314
  production.log.old

I'd like to use a bash script to pick out all the logs older than x months and append them to *.log.old

X=6  #months

LIST=*.log.*;
for file in $LIST; do
  is_older = file_is_older_than_months( ${file}, ${X} );  # pseudocode
  if is_older; then
    cat ${file} >> production.log.old;
    rm ${file};
  fi
done;

How can I get all the files older than x months? And how can I keep the *.log.old file out of LIST?

Thank you
Stefan

A: 

Presumably as a log file, it won't have been modified since it was created?

Have you considered something like this...

find ./ -name "*.log.*" -mtime +60 -exec rm {} \;

to delete files that have not been modified for 60 days. If the files have been modified more recently then this is no good of course.
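If the goal is to append the old logs to the archive rather than delete them, the same find test can drive a small loop. This is only a sketch: archive_old_logs is a made-up helper name, the directory and day count are its arguments, and it still relies on modification times rather than on the date in the file name. The extra ! -name test keeps the *.log.old archive out of the match. Files are appended in whatever order find returns them.

```shell
#!/bin/sh
# archive_old_logs DIR DAYS (hypothetical helper, not from the answer above):
# append every *.log.* file under DIR whose mtime is more than DAYS days old
# to the matching *.log.old file, then remove the original.
archive_old_logs() {
  find "$1" -type f -name "*.log.*" ! -name "*.log.old" -mtime +"$2" \
    -exec sh -c '
      for f; do
        # production.log.20100314 -> production.log.old
        cat "$f" >> "${f%.*}.old" && rm "$f"
      done' sh {} +
}

# Example call (commented out because it removes files):
# archive_old_logs ./logs 180
```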

Paul Creasey
A: 

You'll have to compare the date in the logfile name with the current date. Take the difference between the years and multiply it by 12 to convert it to months, then add the difference between the months. The sum is the age of the file in months (according to the file name).

For each filename, you can use an AWK filter to extract the year:

awk -F. '{ print substr($3,1,4) }'

You also need the current year:

date "+%Y"

To calculate the difference:

$(( current_year - file_year ))

Similarly for months.
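Put together, the calculation might look like the sketch below. It uses shell parameter expansion instead of AWK to cut the year and month out of the name, and months_old is a hypothetical helper name. The optional second argument (a YYYYMMDD string standing in for today) is an addition that makes the function easy to try out.

```shell
#!/bin/sh
# months_old FILE [TODAY] (hypothetical helper): print the age of FILE in
# calendar months, based on the YYYYMMDD suffix of its name.
# Note: plain sh has no local variables, so these names are global.
months_old() {
  stamp=${1##*.}                          # e.g. 20100314
  today=${2:-$(date "+%Y%m%d")}           # defaults to the current date
  file_year=${stamp%????}                 # first four digits
  file_month=${stamp#????}; file_month=${file_month%??}
  cur_year=${today%????}
  cur_month=${today#????}; cur_month=${cur_month%??}
  # Prefix the month with 1 and subtract 100 so 08 and 09 are not read as octal.
  echo $(( (cur_year - file_year) * 12 + (1$cur_month - 100) - (1$file_month - 100) ))
}

months_old production.log.20100314 20100914   # prints 6
```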

sczizzo
I generally prefer to convert everything to seconds since the epoch (using date, just as you did here) rather than writing scripts that handle time in the years / months / days model. It's not so sticky when we aren't going down to days, but there's still less room for error with the simpler representation.
Charles Duffy
Agreed. That would also reduce the amount of math done in the shell--always a good thing.
sczizzo
Or use a scripting language that can do all this instead of combining the different syntaxes of each of these tools in a bash script. (Perl comes to mind.)
reinierpost
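The seconds-since-epoch idea from the comments above could be sketched like this. It assumes GNU date (the -d option is not portable), and is_older_than_months is a hypothetical helper name. A suffix that is not a date, such as "old", makes the function return 2, so the archive file is never treated as an old log.

```shell
#!/bin/sh
# is_older_than_months FILE MONTHS (hypothetical helper, assumes GNU date):
# succeed when the YYYYMMDD suffix of FILE lies more than MONTHS months back.
is_older_than_months() {
  stamp=${1##*.}                                   # e.g. 20100314
  # Return 2 if the suffix is not a valid date (e.g. production.log.old).
  file_secs=$(date -d "$stamp" "+%s" 2>/dev/null) || return 2
  cutoff_secs=$(date -d "$2 months ago" "+%s")
  [ "$file_secs" -lt "$cutoff_secs" ]
}

# Possible use inside the loop from the question (commented out, it removes files):
# for file in *.log.*; do
#   if is_older_than_months "$file" 6; then
#     cat "$file" >> production.log.old && rm "$file"
#   fi
# done
```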
A: 

Assuming the logs may have been modified since they were created, so that the timestamp in the filename is the more accurate one, here's a gawk script.

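#!/bin/bash
gawk 'BEGIN{
 months=6
 current=systime() #get current time in sec
 sec=months*30*86400 #months in sec (a month is approximated as 30 days)
 output="old.production" #output file
}
FNR==1{ #run once per file, not once per line
 m=split(FILENAME,fn,".")
 if (fn[m] !~ /^[0-9]+$/) { nextfile } #skip names without a date suffix, e.g. production.log.old
 yr=substr(fn[m],1,4) #awk strings are indexed from 1
 mth=substr(fn[m],5,2)
 day=substr(fn[m],7,2)
 t=mktime(yr" "mth" "day" 00 00 00")
 if ( (current-t) > sec){
     print "file: "FILENAME" more than "months" month"
     while( (getline line < FILENAME )>0 ){
       print line >> output #append, so earlier runs are not overwritten
     }
     close(FILENAME)
     cmd="rm \047"FILENAME"\047"
     print cmd
     #system(cmd) #uncomment to use
 }
 nextfile #the rest of this file has already been handled
}' production*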
ghostdog74
+2  A: 

The following script expects GNU date to be installed. You can call it in the directory with your log files with the first parameter as the number of months.

#!/bin/sh

min_date=$(date -d "$1 months ago" "+%Y%m%d")

for log in *.log.*;do
        [ "${log%.log.old}"     "!=" "$log" ] && continue
        [ "${log%.*}.$min_date" "<"  "$log" ] && continue
        cat "$log" >> "${log%.*}.old"
        rm "$log"
done
mdom