views: 1187 | answers: 1

What is it about fgets() implementation that makes it so horrible on large files vs fread?

To demonstrate, run this code:

<?php
$line = str_repeat('HelloWorld', 100000) . "\n";
for($i=0; $i<10000; ++$i)
    file_put_contents('myfile', $line, FILE_APPEND);
// now we have roughly a 10 GB file (10,000 lines of ~1 MB each)

// We'll even let fread go first in case
// the subsequent test would gain any caching benefit
$fp = fopen('myfile', 'r');
$start = microtime(true);
while (fread($fp,4096));
$end = microtime(true);
print ($end-$start) . " using fread\n";
fseek($fp, 0);

$start = microtime(true);
while (fgets($fp));
$end = microtime(true);
print ($end-$start) . " using fgets\n";
?>
+4  A: 

It might have something to do with how you're calling fgets(). The manual says that if you leave the second parameter out:

If no length is specified, it will keep reading from the stream until it reaches the end of the line.

If the data you're working with has some very long lines (e.g. "HelloWorld" repeated 100,000 times = 1,000,000 characters), then fgets() has to read each entire line in one call.

Again, from the manual:

If the majority of the lines in the file are all larger than 8KB, it is more resource efficient for your script to specify the maximum line length.
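A minimal sketch of that suggestion applied to the benchmark above. The buffer size of 4096 is an assumption, picked only to match the fread() call in the question; fgets() still stops at a newline, so passing a length doesn't change what gets read, only how much is buffered per call:

<?php
// Hypothetical variant of the question's fgets() loop: an explicit
// maximum length makes fgets() read in bounded chunks rather than
// sizing its buffer to hold each ~1 MB line at once.
$fp = fopen('myfile', 'r');

$start = microtime(true);
// 4096 is an assumed chunk size, matching the fread() test above.
while (fgets($fp, 4096));
$end = microtime(true);
print ($end - $start) . " using fgets with a length\n";

fclose($fp);
?>

Note that with a length of 4096, each 1 MB line comes back in roughly 245 pieces, so the loop body runs more often — but each call does a small, cheap read instead of one large buffered one.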

nickf