I'm trying to create a script that will loop through files that have their filenames written in the following format: yyyymmdd.hh.filename.

The script is called with:

./loopscript.sh 20091026.00 23
./loopscript.sh 20091026.11 15
./loopscript.sh 20091026.09 20091027.17

The need is for the script to check each hour between those two given dates/hours.

e.g.

cat 20091026.00.filename |more
cat 20091026.01.filename |more
...
cat 20091026.23.filename |more
cat 20091027.01.filename |more
cat 20091027.02.filename |more
...

and so on.

Any idea how to go about this? I don't have any difficulty with standard 0-to-x loops or simple for loops; I'm just not sure how to go about the above.

A: 

One possible solution: convert dates into standard Unix representation of "Seconds passed since the epoch" and loop, increasing this number by 3600 (number of seconds in an hour) each iteration. Example:

#!/bin/bash

# Parse your input to date and hour first, so you get:
date_from=20090911
hour_from=10
date_to=20091026
hour_to=01

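# +%s prints seconds since the epoch.
i=$(date --date="$date_from $hour_from:00:00" +%s)
j=$(date --date="$date_to $hour_to:00:00" +%s)

# Compare numerically (inside [[ ]], < compares strings).
# <= is used so that the end hour itself is included.
while (( i <= j )); do
  date -d "@$i" "+%Y%m%d.%H"
  i=$(( i + 3600 ))
done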
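
The snippet above leaves the parsing of the command-line arguments open. A minimal sketch of that step, assuming GNU date and two full yyyymmdd.hh arguments (hour_stamps is a hypothetical helper name, not part of the original answer), might look like this. It works in UTC so that daylight-saving changes don't skip or repeat an hour:

```shell
#!/bin/bash
# Sketch only: print every yyyymmdd.hh stamp from $1 to $2, inclusive.
# Assumes GNU date (-d and the @seconds syntax).
# hour_stamps is a hypothetical helper name.
hour_stamps() {
    local from=$1 to=$2
    # Split "yyyymmdd.hh" at the dot into date and hour.
    local d1=${from%%.*} h1=${from#*.}
    local d2=${to%%.*}   h2=${to#*.}
    local i j
    # Convert to seconds since the epoch, treating the stamps as UTC.
    i=$(date -u -d "${d1:0:4}-${d1:4:2}-${d1:6:2} $h1:00:00" +%s) || return 1
    j=$(date -u -d "${d2:0:4}-${d2:4:2}-${d2:6:2} $h2:00:00" +%s) || return 1
    while (( i <= j )); do
        date -u -d "@$i" +%Y%m%d.%H
        (( i += 3600 ))
    done
}
```

Something like `for s in $(hour_stamps 20091026.09 20091027.17); do cat "$s.filename" | more; done` would then visit each hour in the range.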
Pavel Shved
Suggest there are 3600 seconds per hour - and increment by hours as requested.
Jonathan Leffler
A: 

How about this:

#!/bin/bash

date1=$1
date2=$2

#verify dates
if ! date -d "$date1" > /dev/null 2>&1 ; then
    echo "first date is invalid" ; exit 1
fi
if ! date -d "$date2" > /dev/null 2>&1 ; then
    echo "second date is invalid" ; exit 1
fi

#set current and end date (as seconds since the epoch)
current=$(date -d "$date1" +%s)
end=$(date -d "$date2" +%s)

#loop over all hours, including the end hour;
#a numeric comparison also terminates if the end is before the start
while [ "$current" -le "$end" ]
do
    file=$(date -d "@$current" +%Y%m%d.%H)
    cat "$file.filename" | more
    current=$((current + 3600))
done
Puppe
This is reasonable for dates but doesn't handle hours at all. Actually, it's a bit suspect for dates as well - if you give it the args 20091027 and 20091028, you get from 20091027.00 through 20091028.00 - I'd expect it to stop at either 1027.23 or 1028.23.
paxdiablo
It does handle hours if you call it with "20091027 00" "20091028 03", forgot to mention that.
Puppe
Aaah, that's better. Presumably that's date doing the grunt work. Not bad.
paxdiablo
You can use the return value of `date` instead of grepping for an error message: `if ! date -d "$date1" >/dev/null 2>&1; then echo "first date is invalid"; exit 1; fi` and the x's aren't necessary in the `while` statement. Also, useless use of `cat`.
Dennis Williamson
Thanks for the tips, updated my code. But I don't see what useless use of cat you mean, the user explicitly wanted the script to "cat 20091026.00.filename |more"
Puppe
Yeah, my mindless fault. I'm not cat'ing the files, I'm doing quite a bit more. I just typed in a command while talking with another guy about piping to more, and wrote that in too, lol. Nice code, man. I had started off with the other one, but this is darn good.
Chasester
A: 

To process each file between two given date/hours, you can use the following:

#!/usr/bin/bash
#set -x

usage() {
    echo 'Usage: loopscript.sh <from> <to>'
    echo '       <from> MUST be yyyymmdd.hh or empty, meaning 00000000.00'
    echo '       <to> can be shorter and is affected by <from>'
    echo '         e.g., 20091026.00       27.01 becomes'
    echo '               20091026.00 20091027.01'
    echo '         If empty, it is set to 99999999.99'
    echo 'Arguments were:'
    echo "   '${from}'"
    echo "   '${to}'"
}

# Check parameters.

from="00000000.00"
to="99999999.99"
if [[ ! -z "$1" ]] ; then
    from=$1
fi
if [[ ! -z "$2" ]] ; then
    to=$2
fi
## Insert this to default to rest-of-day when first argument
##    but no second argument. Basically just sets second
##    argument to 23 so it will be transformed to end-of-day.
#if [[ ! -z "$1" ]] ; then
#    if [[ -z "$2" ]] ; then
#        to=23
#    fi
#fi

if [[ ${#from} -ne 11 || ${#to} -gt 11 ]] ; then
    usage
    exit 1
fi

# Sneaky code to modify a short "to" based on the start of "from".
# ${#from} is the length of ${from}.
# $((${#from}-${#to})) is the length difference between ${from} and ${to}
# ${from:0:$((${#from}-${#to}))} is the start of ${from} long enough
#   to make ${to} the same length.
# ${from:0:$((${#from}-${#to}))}${to} is that with ${to} appended.
# Voila! Easy, no?

if [[ ${#to} -lt ${#from} ]] ; then
    to=${from:0:$((${#from}-${#to}))}${to}
fi

# Process all files, checking that they're inside the range.

echo "From ${from} to ${to}"
for file in [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].[0-9][0-9].* ; do
    if [[ ! ( ${file:0:11} < ${from} || ${file:0:11} > ${to} ) ]] ; then
        echo "   ${file}"
    fi
done

When you create the files 20091026.00.${RANDOM} through 20091028.23.${RANDOM} inclusive, these are a couple of sample runs:

pax> ./loopscript.sh 20091026.07 9
From 20091026.07 to 20091026.09
   20091026.07.21772
   20091026.08.31390
   20091026.09.9214
pax> ./loopscript.sh 20091027.21 28.02
From 20091027.21 to 20091028.02
   20091027.21.22582
   20091027.22.30063
   20091027.23.29437
   20091028.00.14744
   20091028.01.6827
   20091028.02.10366
pax> ./loopscript.sh 00000000.00 99999999.99 # or just leave off the parameters.
   20091026.00.25772
   20091026.01.25964
   20091026.02.21132
   20091026.03.3116
   20091026.04.6271
   20091026.05.14870
   20091026.06.28826
   : : :
   20091028.17.20089
   20091028.18.13816
   20091028.19.7650
   20091028.20.20927
   20091028.21.13248
   20091028.22.9125
   20091028.23.7870

As you can see, the first argument must be of the correct format yyyymmdd.hh. The second argument can be shorter since it inherits the start of the first argument to make it the correct length.
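
To see the inheritance in isolation, here is a small sketch of just that step, pulled out into a function (expand_to is a hypothetical name used only for illustration; the logic mirrors the substring expression in the script above):

```shell
#!/bin/bash
# Sketch only: pad a short <to> with the leading characters of <from>.
# expand_to is a hypothetical helper name.
expand_to() {
    local from=$1 to=$2
    if (( ${#to} < ${#from} )); then
        # Take as many leading characters of from as to is missing.
        to=${from:0:$(( ${#from} - ${#to} ))}${to}
    fi
    printf '%s\n' "$to"
}

expand_to 20091026.07 9        # prints 20091026.09
expand_to 20091027.21 28.02    # prints 20091028.02
```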

This only attempts to process files that exist (as matched by the shell glob) and are of the correct format, not every date/hour within the range. This will be more efficient if you have sparse files (including at the start and the end of the range) since it doesn't need to check that the files exist.
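
The range test works because the stamps are fixed width, so comparing them as strings gives the same order as comparing them chronologically. A minimal sketch of that check on its own (in_range is a hypothetical name; it assumes the first 11 characters of the file name are the yyyymmdd.hh stamp):

```shell
#!/bin/bash
# Sketch only: succeed if the stamp at the start of a file name lies
# between from and to, inclusive. in_range is a hypothetical helper name.
in_range() {
    local stamp=${1:0:11} from=$2 to=$3
    # Inside [[ ]], < and > compare strings, which is what we want here.
    [[ ! ( $stamp < $from || $stamp > $to ) ]]
}

if in_range 20091026.08.31390 20091026.07 20091026.09; then
    echo "inside the range"
fi
```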

By the way, this is the command that created the test files, if you're interested:

pax> for dt in 20091026 20091027 20091028 ; do
         for tm in 00 01 02 ... you get the idea ... 21 22 23 ; do
             touch $dt.$tm.$RANDOM
         done
     done

Please don't type that in verbatim and then complain that it created files like:

20091026.you.12345
20091028.idea.77

I only trimmed down the line so it fits in the code width. :-)
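
If you'd rather not type out all 24 hours, a sketch along these lines should create the same kind of test files. It assumes bash brace expansion and printf for the zero padding; make_test_files and its directory argument are hypothetical, added so the files don't have to land in the current directory:

```shell
#!/bin/bash
# Sketch only: create one file per hour for three days in directory $1.
# make_test_files is a hypothetical helper name.
make_test_files() {
    local dir=$1 dt hh
    for dt in 20091026 20091027 20091028; do
        for hh in {0..23}; do
            # printf pads the hour to two digits (0 -> 00, 7 -> 07).
            touch "$dir/$dt.$(printf '%02d' "$hh").$RANDOM"
        done
    done
}
```

Calling `make_test_files .` would populate the current directory.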

paxdiablo
wow I'm blown away with the answers. I'll get back to ya'll.
Chasester
If someone enters ./loopscript.sh 20091026.07 without a to value, it will run from 20091026.07 to 20091023.07. Is there a way to default it to .23? I have this for the top part of the code, trying to set the current date as the start/end times: `from="$(date +%Y)$(date +%m)$(date +%d).00"; to="$(date +%Y)$(date +%m)$(date +%d).23"; if [[ ! -z "$2" ]] ; then from=$2; fi; if [[ ! -z "$3" ]] ; then to=$3; fi; if [[ ${#from} -ne 11 || ${#to} -gt 11 ]] ; then usage; exit 1; fi; if [[ ${#to} -lt ${#from} ]] ; then to=${from:0:$((${#from}-${#to}))}${to}; fi`
Chasester
Instead of your first two tests, just do the third test on the positional parameters directly: `${#1}` followed by assignments like: `from=${1:-00000000.00}`. For the inner loop that creates the test files: `for tm in 0{0..23}; do touch $dt.${tm: -2}.$RANDOM; done` or without an inner loop: `touch $dt.0{0..9}.$RANDOM; touch $dt.{10..19}.$RANDOM; touch $dt.{20..23}.$RANDOM`
Dennis Williamson
@Chasester: FYI, you can do `date +%Y%m%d` all at once.
Dennis Williamson
oh yeah - good point - thanks!
Chasester
@Chasester, see my update for how to default to end of day if first argument supplied but not second.
paxdiablo
Will do - thank you sir.
Chasester