Hello all,

I have just found out that my script gives me a fatal error:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 440 bytes) in C:\process_txt.php on line 109

That line is this:

$lines = count(file($path)) - 1;

So I think it is having difficulty loading the whole file into memory just to count the number of lines. Is there a more efficient way to do this without running into memory issues?

The text files whose lines I need to count range from 2MB to 500MB, and sometimes maybe a gigabyte.

Thanks all for any help.

+2  A: 

If you're running this on a Linux/Unix host, the easiest solution would be to use exec() or similar to run the command wc -l $path. Just make sure you've sanitized $path first to be sure that it isn't something like "/path/to/file ; rm -rf /".
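
As a rough sketch of that idea (Linux/Unix only, assuming wc is on the PATH), the path can be quoted with escapeshellarg() before it reaches the shell. The function name countLinesWithWc is hypothetical and only for illustration:

```php
<?php
// Hypothetical helper: count lines by delegating to `wc -l` (Linux/Unix only).
function countLinesWithWc(string $path): int
{
    // escapeshellarg() quotes the path so shell metacharacters such as ';'
    // are passed to wc as literal text instead of being interpreted.
    $output = shell_exec('wc -l < ' . escapeshellarg($path));

    // shell_exec() returns null when the command produced no output
    // (for example, when the file could not be opened) and false on failure.
    if ($output === null || $output === false) {
        throw new RuntimeException("Could not run wc on $path");
    }

    // Reading from stdin makes wc print only the number, without the file name.
    return (int) trim($output);
}
```

Note that wc -l counts newline characters, so a final line without a trailing newline is not included in the result.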

Dave Sherohman
I am on a Windows machine! If I were on Linux, I think that would be the best solution!
Abs
That is a non-portable solution.
ghostdog74
@ghostdog74: Why, yes, you're right. It is non-portable. That's why I explicitly acknowledged my suggestion's non-portability by prefacing it with the clause "If you're running this on a Linux/Unix host...".
Dave Sherohman
A: 

Yes.

Open the file, read it line by line and increment a counter for each line.

Carl Smotricz
+5  A: 

This will use less memory, since it doesn't load the whole file into memory:

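$file = "largefile.txt";
$linecount = 0;
$handle = fopen($file, "r");
// Loop on fgets() itself: a while(!feof()) loop can run one extra time after
// the last line and over-count by one when the file ends with a newline.
while (fgets($handle) !== false) {
  $linecount++;
}

fclose($handle);

echo $linecount;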

fgets loads a single line into memory (if the second argument $length is omitted it will keep reading from the stream until it reaches the end of the line, which is what we want). This is still unlikely to be as quick as using something other than PHP, if you care about wall time as well as memory usage.

The only danger with this is if any lines are particularly long (what if you encounter a 2GB file without line breaks?). In that case you're better off reading it in chunks and counting end-of-line characters:

$file="largefile.txt";
$linecount = 0;
$handle = fopen($file, "r");
while(!feof($handle)){
  $line = fgets($handle, 4096);
  $linecount = $linecount + substr_count($line, PHP_EOL);
}

fclose($handle);

echo $linecount;
Dominic Rodger
Thanks for the explanation Dominic - that looks good. I had a feeling it had to be done line by line rather than letting count(file()) load the whole thing into memory!
Abs
The only danger of this snippet is huge files without line breaks, as fgets will then try to suck up the whole file. It'd be safer to read 4kB chunks at a time and count line termination characters.
David Schmitt
@David - how does my edit look? I'm not 100% confident about `PHP_EOL` - does that look right?
Dominic Rodger
@Dominic - not perfect: you could have a Unix-style file (`\n`) being parsed on a Windows machine (`PHP_EOL == '\r\n'`)
nickf
@nickf - good point. How would you address it? How does `fgets` work?
Dominic Rodger
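
One possible way to address the line-ending concern raised in the comments above is to count the "\n" byte directly instead of PHP_EOL, since both Unix ("\n") and Windows ("\r\n") line endings contain it. The sketch below is only an illustration under that assumption: the function name countNewlines and the default chunk size are hypothetical, and files that use a bare "\r" as the line terminator would not be counted.

```php
<?php
// Hypothetical helper: count "\n" bytes by reading the file in fixed-size chunks.
function countNewlines(string $path, int $chunkSize = 4096): int
{
    // Binary mode so that no newline translation happens on Windows.
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Could not open $path");
    }

    $count = 0;
    while (!feof($handle)) {
        // fread() returns at most $chunkSize bytes, regardless of line length,
        // so memory use stays bounded even for a file with no line breaks.
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            break;
        }
        // "\n" is a single byte, so it can never be split across two chunks.
        $count += substr_count($chunk, "\n");
    }

    fclose($handle);
    return $count;
}
```

This counts newline characters rather than lines, so a last line that lacks a trailing newline is not included; add one if the file is non-empty and does not end in "\n" and you want it counted.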
A: 

You have several options. The first is to increase the memory limit, which is probably not the best way to do things given that you say the files can get very large. The other is to use fgets to read the file line by line and increment a counter, which should not cause any memory issues, as only the current line is in memory at any one time.

Yacoby