I have file names that end in yyyymmdd, e.g. myFile.20090601, myFile.20090708, etc.

I want to grep for a pattern in all files from June 08 to July 07 of 2009, i.e. 20090609 to 20090707.

How can I do this with a regex in one go?

I tried:

grep 'myPattern' *20090(6(09|[1-3][0-9])|70[1-7])
+4  A: 
20090(6(09|[1-3][0-9])|70[1-7])$

or

20090(6(0[89]|[1-3][0-9])|70[1-7])$

depending on whether you meant the 8th or the 9th of June (your question seems contradictory there).
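
One way to sanity-check either expression is to run it against every calendar date of 2009 and see which ones it accepts. The sketch below does that in Python; the helper name matched_dates and the "myFile." prefix are made up for illustration, and it assumes the file names end in the eight-digit date.

```python
import re
from datetime import date, timedelta

def matched_dates(pattern, year=2009):
    """Return every yyyymmdd string in `year` that the regex accepts."""
    regex = re.compile(pattern)
    day = date(year, 1, 1)
    hits = []
    while day.year == year:
        stamp = day.strftime("%Y%m%d")
        # Simulate a file name that ends in the date stamp.
        if regex.search("myFile." + stamp):
            hits.append(stamp)
        day += timedelta(days=1)
    return hits

from_9th = matched_dates(r"20090(6(09|[1-3][0-9])|70[1-7])$")
from_8th = matched_dates(r"20090(6(0[89]|[1-3][0-9])|70[1-7])$")

# First date, last date, and number of dates accepted by each variant.
print(from_9th[0], from_9th[-1], len(from_9th))
print(from_8th[0], from_8th[-1], len(from_8th))
```

Note that [1-3][0-9] also accepts day numbers such as 31 or 39; that is harmless here only because no file can carry a date that does not exist.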

Svante
Actually, I meant how do I construct it so I can use it to grep for a pattern in those files. As in, I should be able to do: grep 'myPattern' *20090(6(09|[1-3][0-9])|70[1-7])$ But it returns an error when I do this.
Saobi
ls -1 myFile.* | grep '20090(6(0[89]|[1-3][0-9])|70[1-7])$' | xargs grep <pattern>
Lars Haugseth
Still does not work.
Saobi
*20090(6(09|[1-3][0-9])|70[1-7])$ is not a regex, and it won't match what you want. Also, the shell expands glob patterns, not regexes, and a regex has no built-in notion of a date range.
Osama ALASSIRY
So what are my options?
Saobi
Scripting would do the job, or an extremely long one-liner...
Osama ALASSIRY
@Saobi: Try adding the -E option to the first grep.
Lars Haugseth
Lars's suggestion works, but he left out the -E option to grep. If it still doesn't work, there is something relevant that you are not telling us. Lars, you ought to post an answer, not a comment.
Daniel
A: 

The range of valid days is 09–30 for June and 01–07 for July. Because the ranges of days are dissimilar, we should use a separate regex for each month. For June, this is

/2009 06 (09 | [12][0-9] | 30)/x

(Notice how the day range is split into cases by the tens digit, because which units digits are valid depends on the tens digit.)

And

/2009 07 0[1-7]/x

and then we can join them into

/(2009 06 (09 | [12][0-9] | 30)) | (2009 07 0[1-7])/x

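and then factor out the common parts (which may not be the best for readability) and add the end-of-line assertion, keeping both months inside one group so that the shared prefix and the $ apply to each of them:

/2009 0 (6 (09 | [12][0-9] | 30) | 7 0[1-7]) $/x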
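
The /x notation is Perl's free-spacing mode. As a rough sketch, Python offers the same style through re.VERBOSE. In the version below the two month branches sit inside a single group, so the "2009 0" prefix and the $ anchor apply to both; the names DATE_RANGE and in_range are illustrative only.

```python
import re

DATE_RANGE = re.compile(
    r"""
    2009 0                        # year and the shared leading month digit
    (
        6 (09 | [12][0-9] | 30)   # June 9 through June 30
      |
        7 0[1-7]                  # July 1 through July 7
    )
    $                             # the date must end the file name
    """,
    re.VERBOSE,
)

def in_range(filename):
    """True if the file name ends in a date from 20090609 to 20090707."""
    return DATE_RANGE.search(filename) is not None

for name in ["myFile.20090608", "myFile.20090609", "myFile.20090630",
             "myFile.20090707", "myFile.20090708", "myFile.20090615.bak"]:
    print(name, in_range(name))
```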
Cirno de Bergerac
And you need to manually do this for every date range...
Osama ALASSIRY
+1  A: 

I'd suggest a perl/python script (or any other scripting language) that takes 3 parameters:

  1. The pattern
  2. Start date as yyyymmdd
  3. End date as yyyymmdd

It would:

  1. decode the start and end dates
  2. loop through the files in a folder
  3. decode the date in each file name
  4. check whether that date lies between the start and end dates and, if so, grep the file for the pattern
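
The steps above could be sketched in Python roughly as follows. This is one possible reading rather than a finished tool: the function names are invented here, the date is assumed to be the last eight digits of the file name, and the pattern is treated as a Python regular expression rather than a grep one.

```python
import os
import re
import sys
from datetime import datetime

# Assumption: the date is the last eight digits of the file name.
DATE_SUFFIX = re.compile(r"(\d{8})$")

def parse_date(text):
    """Decode a yyyymmdd string; raises ValueError for impossible dates."""
    return datetime.strptime(text, "%Y%m%d").date()

def files_in_range(folder, start, end):
    """Yield paths in `folder` whose date suffix lies in [start, end]."""
    first, last = parse_date(start), parse_date(end)
    for name in sorted(os.listdir(folder)):
        found = DATE_SUFFIX.search(name)
        if not found:
            continue
        try:
            stamp = parse_date(found.group(1))
        except ValueError:
            continue  # eight digits that are not a real date
        if first <= stamp <= last:
            yield os.path.join(folder, name)

def grep_range(pattern, start, end, folder="."):
    """Return (path, line number, line) for each matching line."""
    regex = re.compile(pattern)
    hits = []
    for path in files_in_range(folder, start, end):
        if not os.path.isfile(path):
            continue
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((path, number, line.rstrip("\n")))
    return hits

if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage: python grep_range.py PATTERN START END
    for path, number, line in grep_range(*sys.argv[1:4]):
        print(f"{path}:{number}:{line}")
```

Comparing decoded dates instead of building a regex means the same script works for any start and end date, including ranges that span several months or years.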
Osama ALASSIRY
+2  A: 
grep 'myPattern' `ls | grep -E "20090(6(09|[1-3][0-9])|70[1-7])"`

This works roughly as follows. Take a list of the files in the current directory (ls), filter that list using the date regex (ls | grep ...), then search the resulting files for your pattern (grep 'myPattern' ...). The back-ticks surrounding ls | grep ... execute that part of the command and substitute its output into the surrounding command. So if it produced output like "file1 file2 file3", the result would be a command like grep 'myPattern' file1 file2 file3.
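
To make the substitution step concrete, here is a small Python sketch of the same pipeline. It filters a list of names with the same regex and then builds the grep command that the back-ticks would have produced. The function names are invented for illustration, and run_grep assumes a grep executable is on the PATH. Each file name becomes its own argument, so names containing spaces are not split the way back-tick output is.

```python
import os
import re
import subprocess

# Same expression as in the command above (no end-of-line anchor).
DATE_FILTER = re.compile(r"20090(6(09|[1-3][0-9])|70[1-7])")

def build_grep_command(pattern, names):
    """Build the argument list that the back-tick substitution would produce."""
    selected = [name for name in names if DATE_FILTER.search(name)]
    # "--" stops option parsing, so a name starting with "-" is still a file.
    return ["grep", "-e", pattern, "--"] + selected

def run_grep(pattern, folder="."):
    """Run grep in `folder` over the selected files and return its output."""
    command = build_grep_command(pattern, sorted(os.listdir(folder)))
    if len(command) == 4:
        return ""  # no files selected; grep would otherwise wait on stdin
    result = subprocess.run(command, cwd=folder, capture_output=True, text=True)
    return result.stdout

print(build_grep_command("myPattern", ["file1", "myFile.20090615", "myFile.20090701"]))
```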

_jameshales
You should change " into ', otherwise, depending on the shell, the wildcard might get expanded.
Daniel
I think that there should be no asterisk at the beginning of the grep -E parameter.
Svante