I'm trying to read through a file, find a line that matches a certain pattern, and then grab a set number of lines of text after that line. I'm not really sure how to approach this.

+1  A: 

First, parse the file into lines: open it, read it, and split on the line break.

lines = File.read(file_name).split("\n")

Then get index

index = lines.index { |x| x.match(/regex_pattern/) }

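Here regex_pattern is the pattern that you are looking for. Use the index as the starting point; the second argument is the number of lines to take (in this case 5).

lines[index, 5]

This returns an array of lines, beginning with the matching line itself; use lines[index + 1, 5] to get only the lines after it. If no line matches, index is nil and the slice raises an error.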

You could combine these steps to reduce the number of lines, but I was attempting to keep it readable.
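
The steps above could be folded into one helper, roughly as sketched below. The method name lines_after_match is hypothetical, and two choices here are assumptions rather than part of the answer: the slice starts one past the match so that only the following lines come back, and an empty array is returned when nothing matches.

```ruby
# Hypothetical helper: returns up to `count` lines that follow the first
# line matching `pattern`, or [] when no line matches.
def lines_after_match(file_name, pattern, count)
  lines = File.readlines(file_name, chomp: true) # whole file is held in memory
  index = lines.index { |line| line.match?(pattern) }
  return [] if index.nil? # guard: slicing with a nil index would raise

  lines[index + 1, count] # start just past the matching line
end
```

Calling lines_after_match("server.log", /ERROR/, 5) (file name made up for illustration) would give at most five lines, fewer if the file ends first.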

Geoff Lanotte
Since this reads the entire file at once, would it have memory problems with bigger documents (up to 5 MB or so)?
Randuin
Not so much; I use similar methods to parse log files of 500 MB.
Geoff Lanotte
Sucking the entire file into memory is poor style, even though virtual memory makes it possible.
Larry K
What would be the correct way to keep the same functionality but in "good style"?
Randuin
Lars has the answer; give it to him.
Geoff Lanotte
A: 
matched = false;
num = 0;
res = "";

new File(filename).each_line { |line|
    if (matched) {
        res += line+"\n";
        num++;
        if (num == num_lines_desired) {
            break;
        }
    } elsif (line.match(/regex/)) {
        matched = true;
    }

}

This has the advantage of not needing to read the whole file in the event of a match.

When done, res will hold the desired lines.
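
As written, the loop above is closer to C than to Ruby and will not parse. A syntactically valid rendering of the same idea might look like the sketch below. The method name collect_after_match is hypothetical, and it differs from the loop in two assumed ways: it collects the lines into an array instead of one string, and it strips the trailing newline from each line instead of appending a second one.

```ruby
# Hypothetical helper: streams the file line by line and stops reading as
# soon as `num_lines_desired` lines after the first match are collected.
def collect_after_match(filename, pattern, num_lines_desired)
  return [] if num_lines_desired <= 0

  matched = false
  res = []

  File.foreach(filename) do |line|
    if matched
      res << line.chomp # each line already ends with "\n", so strip it
      break if res.size >= num_lines_desired
    elsif line.match?(pattern)
      matched = true
    end
  end

  res
end
```

File.foreach closes the file when the block exits, including on break, so no explicit close is needed.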

Borealid
Holy moly. This has to be hands down the ugliest Ruby code I've ever seen. Well, actually, given the amount of syntax errors, I'd hardly even call it Ruby code.
Jörg W Mittag
It's CRuby! Hey, Ruby supports block syntax. You don't *have* to use "begin" and "end" :-P
Borealid
+1  A: 

If you're not tied to Ruby, grep -A 12 trivet will show each line with trivet in it, followed by the 12 lines after it. Any regex will work in place of "trivet".

Slartibartfast
Well, I actually still need to parse the lines that follow and extract information from them, so grep wasn't a good fit.
Randuin
+1  A: 

If you want the n lines after the first line matching pattern in the file filename:

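lines = File.open(filename) do |file|
  # Read forward until a line matches or the file runs out.
  line = file.readline until line =~ /pattern/ || file.eof?
  next unless line =~ /pattern/ # pattern not found: lines is nil
  # Take up to n more lines, stopping early at end of file.
  (1..n).map { file.eof? ? nil : file.readline }.compact
end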

This should handle all cases, such as the pattern not being present in the file (returns nil) or there being fewer than n lines after the matching line (the resulting array contains the remaining lines of the file).
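
The same streaming idea can also be sketched with a lazy enumerator, which some may find easier to read. This is an alternative, not the code above: the method name lines_following is hypothetical, it returns an empty array rather than nil when the pattern is absent, and it strips trailing newlines.

```ruby
# Hypothetical helper: lazily skips to the first matching line, drops it,
# and takes the next `n` lines without reading the rest of the file.
def lines_following(filename, pattern, n)
  File.foreach(filename)
      .lazy
      .drop_while { |line| !line.match?(pattern) } # skip until the match
      .drop(1)                                     # skip the matching line itself
      .first(n)                                    # forces evaluation, stops after n lines
      .map(&:chomp)
end
```

Because the chain is lazy, reading stops once n lines have been taken, so large files are not loaded in full.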

Lars Haugseth
+1 - nicely done
Geoff Lanotte