I have a 200 KB file that I use on multiple pages, but on each page I need only one or two lines of it. How can I read only the lines I need if I know their line numbers?

For example, if I need only the 10th line, I don't want to load all the lines into memory, just the 10th line.

Sorry for my bad English!

A: 

Just loop through them without storing, e.g.

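$i = 1;
$file = fopen('file.txt', 'r');
// fgets() returns false at end of file, so the loop also stops on short files
while (($line = fgets($file)) !== false) { // this gets a whole line from the file
   if ($i == 10) {
       break; // break on the tenth line; $line now holds it
   }
   $i++;
}
fclose($file);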

The example above keeps only the most recently read line in memory, so it is a very memory-efficient way to do it.

bisko
1. You forgot $i++. 2. Why not just check if $i == 10?
zerkms
Bleh, I always forget to put in the increments. As for the == 10, that's a bad habit from parsing too much stuff with repetitions. Really sorry, fixed :)
bisko
stream_get_line() is faster than fgets()
Ivo Sabev
@Ivo: can you measure this difference? By the way, C++ code will be faster than PHP, so do we need to rewrite this in C++?
zerkms
10,000 lines file fgets() - 27 seconds, stream_get_line() - 0.5 seconds. You can use assembler if you want.
Ivo Sabev
@Ivo, check your hard drive, please. 10,000 lines with fgets = ~0.000327, while stream_get_line goes by ~0.0000532. So this confirms it's faster. Not sure why, though.
bisko
@bisko It might depend on the version, as stream_get_line is faster in newer versions of PHP 5+.
Ivo Sabev
@bisko I checked the PHP source code. fgets() is defined in file.c and stream_get_line() is defined in streamsfuncs.c. If you read the source, you can see that fgets() actually calls stream_get_line with a couple of argument checks before it and some post-processing of the result afterwards, which makes fgets() a bit slower. This is version 5.3.2.
Ivo Sabev
@Ivo, that's what I said. What puzzles me is that it took 27 seconds for fgets.
bisko
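Since the numbers quoted in these comments disagree so much, here is a minimal timing sketch for anyone who wants to measure on their own machine. The helper name timeReader, the 10,000-line test file and the 8192-byte length limit are assumptions made for illustration; results will vary with PHP version, disk and caching, so no expected timings are given.

```php
<?php
// Hypothetical helper: time one full pass over $path using the given line reader.
function timeReader(string $path, callable $readLine): float
{
    $handle = fopen($path, 'r');
    $start = microtime(true);
    while ($readLine($handle) !== false) {
        // discard the line; only the cost of reading is being measured
    }
    $elapsed = microtime(true) - $start;
    fclose($handle);
    return $elapsed;
}

// Throwaway 10,000-line file (assumed contents, same line count as in the comments).
$path = tempnam(sys_get_temp_dir(), 'bench');
file_put_contents($path, str_repeat("some line of text\n", 10000));

$fgetsTime = timeReader($path, function ($h) {
    return fgets($h);
});
$streamTime = timeReader($path, function ($h) {
    return stream_get_line($h, 8192, "\n");
});

printf("fgets():           %.6f s\n", $fgetsTime);
printf("stream_get_line(): %.6f s\n", $streamTime);

unlink($path);
```

Running each reader several times and averaging would give steadier numbers than a single pass.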
A: 

Use fgets() 10 times :-) That way you will not keep all 10 lines in memory.

zerkms
+2  A: 

Unless you know the offset of the line, you will need to read every line up to that point. You can just throw away the old lines (that you don't want) by looping through the file with something like fgets(). (EDIT: Rather than fgets(), I would suggest @Gordon's solution)

Possibly a better solution would be to use a database, as the database engine will do the grunt work of storing the strings and allow you to (very efficiently) get a certain "line" without having to read the records before it. It wouldn't be a line but a record with a numeric ID, which amounts to the same thing.
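A minimal sketch of the database idea, assuming the pdo_sqlite extension is available. The table name file_lines, its columns and the three sample rows are hypothetical, and an in-memory database is used only to keep the example self-contained; a real setup would import the file once into a database stored on disk.

```php
<?php
// Assumption: pdo_sqlite is installed. Table and column names are made up for this sketch.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE file_lines (line_no INTEGER PRIMARY KEY, content TEXT NOT NULL)');

// One-time import: store each line under its one-based line number.
$insert = $db->prepare('INSERT INTO file_lines (line_no, content) VALUES (?, ?)');
$db->beginTransaction();
foreach (array('first', 'second', 'third') as $index => $text) {
    $insert->execute(array($index + 1, $text));
}
$db->commit();

// Per-page lookup: fetch one record by its ID without reading the others.
$select = $db->prepare('SELECT content FROM file_lines WHERE line_no = ?');
$select->execute(array(2));
$line = $select->fetchColumn();

echo $line, PHP_EOL; // second
```

The primary key on line_no is what lets the engine find a record directly instead of scanning from the start.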

Yacoby
Please comment on the downvote
Yacoby
Whether a database will be faster is subjective. If the information he is trying to access is at the beginning of the file, reading the file will be a lot faster. Reading from a database is still reading from a file. He will benefit from the database index only if he is looking for something far from the beginning of his file. It also depends on what exactly he is trying to achieve.
Ivo Sabev
He never said the database would be faster. Only that it would be better. The OP's concern could be seen as an issue of memory rather than speed.
webbiedave
@Ivo As @webbiedave said, I never mentioned faster. I was trying to add in the suggestion that there are alternatives that *may* be a better solution to the problem rather than the first solution I suggested.
Yacoby
@Yacoby I didn't read carefully.
Ivo Sabev
A: 
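<?php
    $lines = array(1, 2, 10); // zero-based line numbers to print, in ascending order

    $handle = @fopen("/tmp/inputfile.txt", "r");
    if ($handle) {
        $i = 0;
        // stream_get_line() returns false once there is nothing left to read
        while (($line = stream_get_line($handle, 1000000, "\n")) !== false) {
            if (in_array($i, $lines)) {
                echo $line, PHP_EOL; // the "\n" delimiter is not part of $line
            }

            if ($i >= end($lines)) {
                break; // the last wanted line has been handled, stop reading
            }

            $i++;
        }
        fclose($handle);
    }
?>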
Ivo Sabev
+7  A: 

Try SplFileObject

echo memory_get_usage(), PHP_EOL;        // 333200

$file = new SplFileObject('bible.txt');  // 996kb
$file->seek(5000);                       // jump to line 5000 (zero-based)
echo $file->current(), PHP_EOL;          // output current line 

echo memory_get_usage(), PHP_EOL;        // 342984 vs 3319864 when using file()

For outputting the current line, you can either use current() or just echo $file. I find it clearer to use the method though. You can also use fgets(), but that would get the next line.

Of course, you only need the middle three lines. I've added the memory_get_usage calls just to prove this approach does eat almost no memory.
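If line numbers are easier to think about as one-based (as a text editor shows them), the seek can be wrapped in a small helper. This is only a sketch: the function name readLineAt is hypothetical, and the out-of-range check relies on key() not reaching the requested index when the file is too short, which may behave differently across PHP versions.

```php
<?php
// Hypothetical helper: return line $lineNumber (one-based) of $path,
// or null if that line cannot be reached.
function readLineAt(string $path, int $lineNumber): ?string
{
    if ($lineNumber < 1) {
        return null;
    }
    $file = new SplFileObject($path);
    $file->seek($lineNumber - 1);            // seek() counts lines from zero
    if ($file->key() !== $lineNumber - 1) {
        return null;                         // the seek did not land on that line
    }
    $line = $file->current();
    return is_string($line) ? rtrim($line, "\r\n") : null;
}

// Small demonstration on a throwaway file.
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "alpha\nbeta\ngamma\n");
echo readLineAt($path, 2), PHP_EOL;          // beta
unlink($path);
```

The helper strips the trailing line break, which current() on its own leaves in place.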

Gordon
Nice. I didn't notice that `seek` was line rather than byte based.
Yacoby
+1 I prefer this code because it's just less work for the programmer, and it's clearer what's happening (seeking to certain line) than `fgets`.
notJim
@Yacoby there is `SplFileObject::fseek()` and `SplFileObject::seek()`. The latter is line based, the other is byte based. `seek()` is a method from the `SeekableIterator` interface.
Gordon
Even faster :)))
Ivo Sabev
Note that the line being `seek`-ed to is not line 5,000. The `$line_pos` parameter is zero-based, so the example seeks to line number 5,001 as it would be seen in a text editor, etc.
salathe
Thanks, this is really helpful!
coolboycsaba
A: 

Why are you only trying to load the first ten lines? Do you know that loading all those lines is in fact a problem?

If you haven't measured, then you don't know that it's a problem. Don't waste your time optimizing for non-problems. Chances are that any performance change you'll have in not loading the entire 200K file will be imperceptible, unless you know for a fact that loading that file is indeed a bottleneck.
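For anyone who does want to measure, a rough sketch along these lines could be a starting point. The generated file of roughly 200 KB and all variable names are assumptions for illustration; memory_get_usage() and microtime() give only coarse figures, and the actual numbers depend on the machine and PHP version.

```php
<?php
// Throwaway file of about 200 KB standing in for the real one (assumed contents).
$path = tempnam(sys_get_temp_dir(), 'measure');
file_put_contents($path, str_repeat(str_repeat('x', 99) . "\n", 2000));

// Approach 1: load every line into an array, then pick the tenth.
$memBefore = memory_get_usage();
$start = microtime(true);
$all = file($path);
$tenthFromArray = $all[9];
$loadAllTime = microtime(true) - $start;
$loadAllMem = memory_get_usage() - $memBefore;
unset($all);

// Approach 2: read line by line and stop at the tenth.
$start = microtime(true);
$handle = fopen($path, 'r');
for ($i = 1; $i <= 10; $i++) {
    $tenthFromLoop = fgets($handle);
}
fclose($handle);
$loopTime = microtime(true) - $start;

printf("file():  %.6f s, about %d bytes held\n", $loadAllTime, $loadAllMem);
printf("fgets(): %.6f s\n", $loopTime);

unlink($path);
```

If the two timings are both far below anything a visitor would notice, the simpler code is probably the better choice.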

Andy Lester
A: 

Do the contents of the file change? If it's static, or relatively static, you can build a list of offsets where you want to read your data. For instance, if the file changes once a year, but you read it hundreds of times a day, then you can pre-compute the offsets of the lines you want and jump to them directly like this:

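 $offsets = array();
 $filehandle = fopen('file.txt', 'rb');
 $lineno = 1;
 $pos = 0;
 while (fgets($filehandle) !== false) {
     $offsets[$lineno++] = $pos;  // byte offset where the line just read starts
     $pos = ftell($filehandle);   // the next line starts here
 }
 fclose($filehandle);             // $offsets[10], $offsets[20], ... are now known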

and so on. Afterwards, you can trivially jump to that line's location like this:

 $fh = fopen('file.txt', 'rb');
 fseek($fh, $offsets[20]); // jump to line 20
 $line = fgets($fh);       // read just that line

But this could be complete overkill. Try benchmarking the operations - compare how long it takes to do an old-fashioned "read 20 lines" versus precompute/jump.
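Put together, the precompute/jump idea might look like the sketch below. The function names buildLineOffsets and readLineAtOffset are hypothetical, line numbers are taken to be one-based, and the three-line sample file is made up for the demonstration.

```php
<?php
// Hypothetical helper: record the byte offset at which every line starts.
function buildLineOffsets(string $path): array
{
    $offsets = array();
    $handle = fopen($path, 'rb');
    $lineNumber = 1;
    $position = 0;
    while (fgets($handle) !== false) {
        $offsets[$lineNumber++] = $position; // where the line just read began
        $position = ftell($handle);          // where the next line begins
    }
    fclose($handle);
    return $offsets;
}

// Hypothetical helper: jump straight to a line using a previously built index.
function readLineAtOffset(string $path, array $offsets, int $lineNumber): ?string
{
    if (!isset($offsets[$lineNumber])) {
        return null; // that line is not in the index
    }
    $handle = fopen($path, 'rb');
    fseek($handle, $offsets[$lineNumber]);
    $line = fgets($handle);
    fclose($handle);
    return $line === false ? null : rtrim($line, "\r\n");
}

// Small demonstration on a throwaway file.
$path = tempnam(sys_get_temp_dir(), 'offsets');
file_put_contents($path, "alpha\nbeta\ngamma\n");

$offsets = buildLineOffsets($path);                  // build once, then cache the array
echo readLineAtOffset($path, $offsets, 3), PHP_EOL;  // gamma

unlink($path);
```

The index is only valid while the file is unchanged, so one option is to store it alongside the file's modification time and rebuild it when that time changes.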

Marc B