views: 171

answers: 5

Hi,

I have a function that detects all files whose names start with a given string and returns an array of the matching files, but it is starting to get slow because I have around 20000 files in a particular directory. I need to optimize this function, but I just can't see how. This is the function:

function DetectPrefix($filePath, $prefix)
{
    $dh = opendir($filePath);
    while (false !== ($filename = readdir($dh))) {
        $posIni = strpos($filename, $prefix);
        if ($posIni === 0) {
            $files[] = $filename;
        }
    }

    if (count($files) > 0) {
        return $files;
    } else {
        return null;
    }
}

What more can I do?

Thanks

+11  A: 

http://php.net/glob

$files = glob('/file/path/prefix*');
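A sketch of the question's function rewritten around glob(), assuming $prefix contains no glob metacharacters such as * or ?:

function DetectPrefix($filePath, $prefix)
{
    // Hypothetical rewrite using glob(); the pattern matches everything
    // in $filePath whose name starts with $prefix.
    $matches = glob(rtrim($filePath, '/') . '/' . $prefix . '*');
    if ($matches === false || count($matches) === 0) {
        return null;
    }
    // glob() returns full paths; strip the directory part to mirror the
    // original function, which returned bare filenames.
    return array_map('basename', $matches);
}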

Wikipedia breaks uploads up by the first couple of letters of their filenames, so excelfile.xls would go in a directory like /uploads/e/x while textfile.txt would go in /uploads/t/e.

Not only does this reduce the number of files glob (or any other approach) has to sort through, but it avoids the maximum files in a directory issue others have mentioned.
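A rough sketch of that kind of layout (the two-character /uploads/e/x scheme here is just an illustration, not Wikipedia's exact rule):

// Hypothetical helper: shard files into subdirectories by the first two
// characters of the name, e.g. "excelfile.xls" -> "/uploads/e/x/excelfile.xls".
function shardedPath($baseDir, $filename)
{
    $name = strtolower($filename);
    return $baseDir . '/' . substr($name, 0, 1) . '/' . substr($name, 1, 1) . '/' . $filename;
}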

ceejayoz
+3  A: 

You could use scandir() to list the files in the directory, instead of iterating through them one-by-one using readdir(). scandir() returns an array of the files.

However, it'd be better if you could change your file system organization - do you really need to store 20000+ files in a single directory?
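For example, a sketch of the same prefix filter built on scandir() might look like this:

function DetectPrefix($filePath, $prefix)
{
    // Hypothetical variant built on scandir(), which returns the whole
    // directory listing as an array (including "." and "..").
    $all = scandir($filePath);
    if ($all === false) {
        return null;
    }
    $files = array();
    foreach ($all as $filename) {
        if (strpos($filename, $prefix) === 0) {
            $files[] = $filename;
        }
    }
    return count($files) ? $files : null;
}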

Peter
+1  A: 

I'm not sure, but DirectoryIterator is probably a bit faster. Also add caching, so that the list is only regenerated when files are added or deleted.
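A minimal sketch of the DirectoryIterator version (the caching layer is left out and would wrap around this call):

function DetectPrefix($filePath, $prefix)
{
    // Hypothetical variant using SPL's DirectoryIterator; skips "." and "..".
    $files = array();
    foreach (new DirectoryIterator($filePath) as $entry) {
        if ($entry->isDot()) {
            continue;
        }
        if (strpos($entry->getFilename(), $prefix) === 0) {
            $files[] = $entry->getFilename();
        }
    }
    return count($files) ? $files : null;
}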

raspi
A: 

You only need to compare the first strlen($prefix) characters of each filename. So try this:

function DetectPrefix($filePath, $prefix) {
    $dh    = opendir($filePath);
    $len   = strlen($prefix);
    $files = array();
    while (false !== ($filename = readdir($dh))) {
        // Compare only the first $len characters instead of searching
        // the whole filename with strpos().
        if (substr($filename, 0, $len) === $prefix) {
            $files[] = $filename;
        }
    }
    closedir($dh);
    if (count($files)) {
        return $files;
    } else {
        return null;
    }
}
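If you want to avoid building a substring for every entry, strncmp() can do the same prefix check in place; a sketch of the changed condition:

// Hypothetical alternative condition: compare the first $len bytes directly
// instead of creating a substring for every directory entry.
if (strncmp($filename, $prefix, $len) === 0) {
    $files[] = $filename;
}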
Gumbo
That's true, but IMHO this is more a matter of I/O than of in-memory processing.
H_I
+2  A: 

As the other answers mention, I'd look at glob(), scandir(), and/or the DirectoryIterator class; there is no need to reinvent the wheel.

However, watch out! Check your operating system and file system, because there may be a limit on the maximum number of files in a single directory. If that is the case and you just keep adding files to the same directory, you will have some downtime and some problems when you reach the limit. The error will probably appear as a permissions or write failure rather than an obvious "you can't write more files in a single directory" message.

KM