ansaurus

Question

Count the number of lines in a file with Ruby, without reading entire file into memory

Answer 1

+4 A:

It doesn't matter what language you're using: you're going to have to read the whole file if the lines are of variable length. That's because the newlines could be anywhere, and there's no way to know where they are without reading the file (assuming it isn't cached, which generally speaking it isn't).

If you want to indicate progress, you have two realistic options. You can extrapolate progress based on assumed line length:

assumed lines in file = size of file / assumed line size
progress = lines processed / assumed lines in file * 100%

since you know the size of the file. Alternatively you can measure progress as:

progress = bytes processed / size of file * 100%

This should be sufficient.
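
As a rough illustration of the byte-based approach, here is a minimal Ruby sketch. The method name count_lines_with_progress and the block-based progress callback are assumptions made for this example, not something from the original post. It reads one line at a time, so only the current line is held in memory, and it assumes the file does not change size while it is being read.

```ruby
# Hypothetical helper: counts lines while reporting progress as
# bytes processed / size of file * 100.
def count_lines_with_progress(path)
  total_bytes = File.size(path)
  bytes_read = 0
  lines = 0

  # File.foreach yields one line at a time instead of loading the whole file.
  File.foreach(path) do |line|
    lines += 1
    bytes_read += line.bytesize
    # Guard against division by zero for an empty file.
    percent = total_bytes.zero? ? 100.0 : bytes_read * 100.0 / total_bytes
    yield percent if block_given?
  end

  lines
end
```

In practice you would probably report progress only every few thousand lines rather than on every line, since calling the block per line adds overhead.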

cletus 2010-04-16 04:06:59

Actually, for my needs this is probably a better idea than counting the number of lines.

smnirven 2010-04-16 13:45:36

I'm assuming the original poster was ok with reading through the file, just not having the entire contents of it in memory.

Andrew Grimm 2010-04-22 23:18:58

Answer 2

+4 A:

If you are in a Unix environment, you can just let wc -l do the work. It does not load the whole file into memory; it is optimized for streaming a file and counting words and lines, so it will perform better than streaming the file yourself in Ruby.

DJ 2010-04-16 04:53:56

wc is so fast that one probably won't need a progress counter.

Wayne Conrad 2010-04-16 05:36:24

Answer 3

+2 A:

Reading the file a line at a time:

count = File.foreach(filename).inject(0) {|c, line| c+1}

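will be slower than shelling out to wc (passing the command as an array avoids shell quoting problems with spaces or special characters in the filename):

count = IO.popen(["wc", "-l", filename], &:read).split.first.to_i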
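
If wc is not available (for example on a non-Unix system), one possible middle ground is to read the file in fixed-size chunks and count newline characters. The sketch below is only an illustration; the method name count_newlines and the 64 KB default chunk size are assumptions chosen for this example. Memory use stays bounded by the chunk size even if the file contains very long lines.

```ruby
# Hypothetical helper: counts newline characters by reading fixed-size chunks.
# Like wc -l, it counts "\n" characters, so a final line with no trailing
# newline is not included in the total.
def count_newlines(path, chunk_size = 64 * 1024)
  count = 0
  # Binary mode avoids encoding checks and newline translation.
  File.open(path, "rb") do |file|
    # IO#read with a length returns nil at end of file, which ends the loop.
    while (chunk = file.read(chunk_size))
      count += chunk.count("\n")
    end
  end
  count
end
```

Usage would look like count = count_newlines(filename). Whether this beats File.foreach depends on the file and the Ruby version, so it is worth measuring on your own data.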

glenn jackman 2010-04-16 10:33:38
