I'd like to store 0 to ~5000 IP addresses in a plain text file, with an unrelated header at the top. Something like this:

Unrelated data
Unrelated data
----SEPARATOR----
1.2.3.4
5.6.7.8
9.1.2.3

Now I'd like to find if '5.6.7.8' is in that text file using PHP. I've only ever loaded an entire file and processed it in memory, but I wondered if there was a more efficient way of searching a text file in PHP. I only need a true/false if it's there.

Could anyone shed any light? Or would I be stuck with loading in the whole file first?

Thanks in advance!

+1  A: 

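Perl's command line tool can read the file line by line rather than loading it whole, so you could do something similar to this:

<?php
...
// -n reads the file line by line; the matching line is printed and perl stops there
$script = 'if (/^5\.6\.7\.8$/) { print; last }';
$result = exec('perl -ne ' . escapeshellarg($script) . ' yourfile.txt');
if ($result)
    ....
else
    ....
...
?>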

Another option would be to store the IPs in separate files based on the first or second group:

# 1.2.txt
1.2.3.4
1.2.3.5
1.2.3.6
...

# 5.6.txt
5.6.7.8
5.6.7.9
5.6.7.10
...

... etc.

That way you wouldn't necessarily have to worry about the files being so large you incur a performance penalty by loading the whole file into memory.
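A rough sketch of how a lookup against files split this way might look. The function name `ipInShardedFiles`, the directory argument, and the "first two groups plus .txt" naming are assumptions for illustration only:

```php
<?php
// Hypothetical helper: the name, the directory layout and the file naming
// scheme ("<first>.<second>.txt") are assumptions, not an established API.
function ipInShardedFiles(string $ip, string $dir): bool
{
    // Reject anything that is not a dotted-quad IPv4 address
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return false;
    }
    $groups = explode('.', $ip);
    // e.g. "5.6.7.8" is expected in "<dir>/5.6.txt"
    $path = $dir . '/' . $groups[0] . '.' . $groups[1] . '.txt';
    if (!is_file($path)) {
        return false; // no file for this prefix, so the IP is not stored
    }
    // Only this one small file is loaded, one array entry per line
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    return $lines !== false && in_array($ip, $lines, true);
}
```

Something like `ipInShardedFiles('5.6.7.8', '/path/to/ips')` would then only ever read 5.6.txt.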

localshred
+1 For splitting up the files. That will reduce the costs.
Gumbo
+5  A: 

5000 isn't a lot of records. You could easily do this:

$addresses = explode("\n", file_get_contents('filename.txt'));

and search it manually and it'll be quick.

If you were storing a lot more I would suggest storing them in a database, which is designed for that kind of thing. But for 5000 I think the full load plus brute force search is fine.

Don't optimize a problem until you have a problem. There's no point needlessly overcomplicating your solution.
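For what it's worth, a minimal sketch of the "load it all and search" approach that also skips the header from the question. The function name `ipListedInFile` is made up, and returning false when the separator line is missing is an assumption:

```php
<?php
// Hypothetical helper; the function name and the "no separator means no match"
// behaviour are assumptions for this sketch.
function ipListedInFile(string $ip, string $path, string $separator = '----SEPARATOR----'): bool
{
    $contents = @file_get_contents($path);
    if ($contents === false) {
        return false; // unreadable or missing file
    }
    // trim() also drops a trailing "\r" left over from Windows line endings
    $lines = array_map('trim', explode("\n", $contents));
    $start = array_search($separator, $lines, true);
    if ($start === false) {
        return false; // header separator not found
    }
    // Everything after the separator line is treated as the address list
    return in_array($ip, array_slice($lines, $start + 1), true);
}
```

Comparing whole lines with `in_array` avoids partial matches such as 5.6.7.8 against 5.6.7.80.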

cletus
I agree that having lots of records is probably better handled by a database table indexed for searching on the IP column.
localshred
As you said, worked a treat and very fast! 'Don't optimize a problem until you have a problem' Sound advice, thank you :)
Al
As an alternative you can as well use [file('filename.txt')](http://php.net/manual/function.file.php)
slosd
A: 

You could shell out and grep for it.

Cody Caughlan
A: 

You might try fgets()

It reads a file line by line. I'm not sure how much more efficient this is though. I'm guessing that if the IP was towards the top of the file it would be more efficient and if the IP was towards the bottom it would be less efficient than just reading in the whole file.
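Here is a sketch of what the fgets() version could look like, stopping as soon as the IP is found. The function name `ipFoundByLine` is hypothetical, and it assumes the separator line from the question is always present:

```php
<?php
// Hypothetical helper; the name and the default separator are assumptions
// taken from the sample file in the question.
function ipFoundByLine(string $ip, string $path, string $separator = '----SEPARATOR----'): bool
{
    if (!is_readable($path)) {
        return false;
    }
    $fp = fopen($path, 'r');
    if ($fp === false) {
        return false;
    }
    $pastHeader = false;
    $found = false;
    while (($line = fgets($fp)) !== false) {
        $line = rtrim($line, "\r\n"); // fgets keeps the line ending
        if (!$pastHeader) {
            // Skip header lines until the separator has been seen
            $pastHeader = ($line === $separator);
            continue;
        }
        if ($line === $ip) {
            $found = true;
            break; // stop reading as soon as there is a match
        }
    }
    fclose($fp);
    return $found;
}
```

Only one line is held in memory at a time, so the cost depends on how far down the file the match is.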

Bart
A: 

You could use the grep command with backticks in your PHP code on a Linux server. Something like:

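$searchFor = '5.6.7.8';
$file      = '/path/to/file.txt';

// Escape both values before they reach the shell
$searchArg = escapeshellarg($searchFor);
$fileArg   = escapeshellarg($file);

// -F treats the IP as a fixed string (dots are literal), -x matches whole lines only
$grepCmd   = `grep -Fx -- $searchArg $fileArg`;
echo $grepCmd;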
Phill Pafford
A: 

I haven't tested this personally, but there is a snippet of code in the PHP manual that is written for large file parsing:

http://www.php.net/manual/en/function.fgets.php#59393

//File to be opened
$file = "huge.file";
//Open file (DON'T USE a+ pointer will be wrong!)
$fp = fopen($file, 'r');
//Read 16meg chunks
$read = 16777216;
//\n Marker
$part = 0;

while(!feof($fp)) {
    $rbuf = fread($fp, $read);
    for($i=$read;$i > 0 || $n == chr(10);$i--) {
        $n=substr($rbuf, $i, 1);
        if($n == chr(10))break;
        //If we are at the end of the file, just grab the rest and stop loop
        elseif(feof($fp)) {
            $i = $read;
            $buf = substr($rbuf, 0, $i+1);
            break;
        }
    }
    //This is the buffer we want to do stuff with, maybe throw to a function?
    $buf = substr($rbuf, 0, $i+1);
    //Point marker back to last \n point
    $part = ftell($fp)-($read-($i+1));
    fseek($fp, $part);
}
fclose($fp);

The snippet was written by the original author: hackajar yahoo com
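For the specific case of looking for one IP, a simpler chunked search might be enough. This is only a sketch, not the manual's snippet: the function name `ipFoundInChunks` and the chunk size are assumptions, and it checks every line of the file (header included) rather than skipping to the separator:

```php
<?php
// Hypothetical helper; reads fixed-size chunks and carries the unfinished
// last line over to the next chunk so that no line is split in two.
function ipFoundInChunks(string $ip, string $path, int $chunkSize = 8192): bool
{
    if ($chunkSize < 1 || !is_readable($path)) {
        return false;
    }
    $fp = fopen($path, 'rb');
    if ($fp === false) {
        return false;
    }
    $needle = "\n" . $ip . "\n"; // newlines on both sides force a whole-line match
    $carry = "\n";               // leading newline lets the first line match too
    $found = false;
    while (!feof($fp)) {
        $chunk = fread($fp, $chunkSize);
        if ($chunk === false) {
            break;
        }
        $buf = $carry . str_replace("\r", '', $chunk);
        if (strpos($buf, $needle) !== false) {
            $found = true;
            break;
        }
        // Keep everything from the last newline on: it may be a partial line
        $carry = substr($buf, strrpos($buf, "\n"));
    }
    fclose($fp);
    // A final line without a trailing newline is still sitting in $carry
    return $found || $carry === "\n" . $ip;
}
```

Memory use stays around one chunk plus one line, however large the file is.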

shedd
A: 

Are you trying to compare the current IP with the IPs listed in the text file? The unrelated data wouldn't match anyway, so just use strpos on the full file contents (file_get_contents).

<?php
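    // Wrap the contents and the IP in newlines so only whole lines match
    // (otherwise 5.6.7.8 would also match 15.6.7.80)
    $file = "\n" . str_replace("\r", '', file_get_contents('data.txt')) . "\n";
    $pos = strpos($file, "\n" . $_SERVER['REMOTE_ADDR'] . "\n");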
    if($pos === false) {
        echo "no match for $_SERVER[REMOTE_ADDR]";
    }
    else {
        echo "match for $_SERVER[REMOTE_ADDR]!";
    }
?>
tann98