Is there a way to take text like the sample below (whether it is already in an array or in a file) and strip out the lines that fall within a specified date range?

For instance, if I wanted every line from 2009-09-04 through 2009-09-09 to be pulled out (maybe this can be done with grep?), how would I go about doing so?

date,test,time,avail
2009-09-01,JS,0.119,99.90
2009-09-02,JS,0.154,99.89
2009-09-03,SWF,0.177,99.90
2009-09-04,SWF,0.177,99.90
2009-09-05,SWF,0.177,99.90
2009-09-06,SWF,0.177,99.90
2009-09-07,SWF,0.177,99.90
2009-09-08,SWF,0.177,99.90
2009-09-09,SWF,0.177,99.90
2009-09-10,SWF,0.177,99.90

Thanks!

+2  A: 

Hi,

(This solution is in PHP, but you can probably do the same directly from the command line with some kind of grep or a similar tool.)

Since your dates are in the YYYY-MM-DD format and sit at the beginning of each line, comparing the date strings alphabetically is the same as comparing the dates chronologically.
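
To illustrate why a plain string comparison is enough, here is a minimal Python sketch. The function name in_range and the sample lines are only for illustration; it assumes the date occupies the first ten characters of every line.

```python
# Illustrative sketch: ISO dates (YYYY-MM-DD) order the same way as strings.
def in_range(line, start="2009-09-04", end="2009-09-09"):
    """Return True if the line's leading date lies in [start, end]."""
    date = line[:10]  # assumption: the date is the first 10 characters
    return start <= date <= end

lines = [
    "2009-09-03,SWF,0.177,99.90",
    "2009-09-04,SWF,0.177,99.90",
    "2009-09-09,SWF,0.177,99.90",
    "2009-09-10,SWF,0.177,99.90",
]
kept = [line for line in lines if in_range(line)]
print(kept)
```

Slicing off the date first matters for the upper bound: a whole line such as "2009-09-09,SWF,..." sorts after the bare string "2009-09-09". A header line like "date,test,time,avail" is rejected too, because letters sort after digits.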

One solution would be to:

  • load the string
  • explode it by lines
  • remove the first line
  • iterate over the lines, keeping only those that interest you

For the first parts:

$str = <<<STR
date,test,time,avail
2009-09-01,JS,0.119,99.90
2009-09-02,JS,0.154,99.89
2009-09-03,SWF,0.177,99.90
2009-09-04,SWF,0.177,99.90
2009-09-05,SWF,0.177,99.90
2009-09-06,SWF,0.177,99.90
2009-09-07,SWF,0.177,99.90
2009-09-08,SWF,0.177,99.90
2009-09-09,SWF,0.177,99.90
2009-09-10,SWF,0.177,99.90
STR;
$lines = explode(PHP_EOL, $str);
unset($lines[0]); // first line is useless

To iterate over the lines and keep only the ones you want, you could use a foreach loop, or the array_filter function, which exists for exactly this ;-)

For instance, you could use something like this:

$new_lines = array_filter($lines, 'my_filter');
var_dump($new_lines);

And your callback function would be:

function my_filter($line) {
    $min = '2009-09-04';
    $max = '2009-09-09';
    // Compare only the 10-character date prefix; comparing the whole line
    // would exclude $max, because '2009-09-09,SWF,...' sorts after '2009-09-09'.
    $date = substr($line, 0, 10);
    return $date >= $min && $date <= $max;
}

And the result:

array
  4 => string '2009-09-04,SWF,0.177,99.90' (length=26)
  5 => string '2009-09-05,SWF,0.177,99.90' (length=26)
  6 => string '2009-09-06,SWF,0.177,99.90' (length=26)
  7 => string '2009-09-07,SWF,0.177,99.90' (length=26)
  8 => string '2009-09-08,SWF,0.177,99.90' (length=26)
  9 => string '2009-09-09,SWF,0.177,99.90' (length=26)

Hope this helps ;-)


If your dates were not in the YYYY-MM-DD format, or not at the beginning of each line, you'd have to explode each line, use strtotime (or do some custom parsing, depending on the format), and then compare timestamps.
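
As a rough sketch of that parse-then-compare approach, here is a Python version. The DD/MM/YYYY format, the date sitting in the second column, and the helper name parse_row_date are all assumptions made up for this example.

```python
from datetime import datetime

def parse_row_date(line, column=1, fmt="%d/%m/%Y"):
    """Parse the date found in the given comma-separated column.

    Both the column index and the format are assumptions for this sketch.
    """
    return datetime.strptime(line.split(",")[column], fmt)

start = datetime(2009, 9, 4)
end = datetime(2009, 9, 9)

lines = [
    "JS,03/09/2009,0.119,99.90",
    "SWF,04/09/2009,0.177,99.90",
    "SWF,09/09/2009,0.177,99.90",
    "SWF,10/09/2009,0.177,99.90",
]
# Compare real datetime objects instead of raw strings.
kept = [line for line in lines if start <= parse_row_date(line) <= end]
print(kept)
```

With a format like DD/MM/YYYY, comparing the raw strings would order by day first, which is why the dates have to be parsed before comparing.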

But, in your case... No need for all that ;-)

Pascal MARTIN
Awesome, just what I was looking for!
+3  A: 

Well, you could probably somehow make it work with grep, but sed is more suited for the task:

sort < file.csv | sed -n '/^2009-09-04/,/^2009-09-09/p'
Inshallah
+4  A: 

Python

import csv
import datetime

start = datetime.datetime(2009, 9, 4)
end = datetime.datetime(2009, 9, 9)

# newline="" is what the csv module expects for files in Python 3
with open("someFile", newline="") as f:
    for row in csv.DictReader(f):
        dt = datetime.datetime.strptime(row["date"], "%Y-%m-%d")
        if start <= dt <= end:
            print(row)  # depends on what "pulled out" means
S.Lott
+1  A: 

An awk solution is similar to the sed one. This version skips the lines in the range and prints everything else:

awk '/^2009-09-04/,/^2009-09-09/ {next} {print}' filename

Without hardcoding the dates:

awk -v start='^2009-09-04' -v stop='^2009-09-09' '
    $0 ~ start, $0 ~ stop {next}
    {print}
' date.data
glenn jackman
A: 

Using R

> d <- read.csv("http://dpaste.com/88980/plain/", sep=",", header=T)
> r1 <- rownames(d[d$date == "2009-09-04",])
> r2 <- rownames(d[d$date == "2009-09-09",])
> d[rownames(d) %in% r1:r2,]
        date test  time avail
4 2009-09-04  SWF 0.177  99.9
5 2009-09-05  SWF 0.177  99.9
6 2009-09-06  SWF 0.177  99.9
7 2009-09-07  SWF 0.177  99.9
8 2009-09-08  SWF 0.177  99.9
9 2009-09-09  SWF 0.177  99.9
>
rjuuser
Do you see an R tag? Do you see an accepted answer? Do you see how old this question and every answer but yours is?
Chris Lutz
A: 

Perl:

perl -F/,/ -ane '
    print if $F[0] ge "2009-09-04"
          && $F[0] le "2009-09-09"' filename
mobrule
+1  A: 

You can use Perl's flip-flop operator (..) to extract a range of lines.
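
For readers who don't know the operator, here is a minimal Python sketch of the same idea. It is not Perl code, only an emulation of the two-dot behaviour, and the function name extract_range and the sample lines are made up for this example.

```python
import re

def extract_range(lines, start_pattern, stop_pattern):
    """Yield lines from a match of start_pattern through the next match of
    stop_pattern, inclusive. The stop pattern is also tested on the line
    that opened the range, as Perl's two-dot operator does."""
    inside = False
    for line in lines:
        if not inside and re.search(start_pattern, line):
            inside = True  # the range opens on this line
        if inside:
            yield line
            if re.search(stop_pattern, line):
                inside = False  # the range closes after this line

lines = [
    "2009-09-03,SWF,0.177,99.90",
    "2009-09-04,SWF,0.177,99.90",
    "2009-09-05,SWF,0.177,99.90",
    "2009-09-09,SWF,0.177,99.90",
    "2009-09-10,SWF,0.177,99.90",
]
picked = list(extract_range(lines, r"^2009-09-04", r"^2009-09-09"))
print(picked)
```

Like any range-based approach, this relies on the input being sorted by date and on both boundary dates actually appearing in the data. If the stop date is missing, the range runs to the end of the input.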

Geo