What's the most efficient way to turn the filenames in a directory into an array of strings?

Thanks.

+1  A: 

`glob()` will be your friend.

wich
+2  A: 

Try using glob(); see the PHP docs.

It acts the same way as your typical dir function. If you want everything from the current directory, use glob('*'), since glob supports wildcard matching. If you want to see, say, text files from another directory, use glob('another/directory/*.txt'). It's a powerful tool.
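
To make the wildcard behaviour concrete, here is a minimal sketch. The scratch directory and the file names in it are made up for the demonstration. It also shows one thing worth knowing when comparing glob() with the other approaches in this thread: on typical systems a bare `*` does not match names that start with a dot.

```php
<?php
// Hypothetical scratch directory, created only for this demonstration.
$dir = sys_get_temp_dir() . '/glob_demo_' . uniqid();
mkdir($dir);
$names = array('a.txt', 'b.txt', 'c.log', '.hidden');
foreach ($names as $name) {
    touch($dir . '/' . $name);
}

// Only the .txt files. glob() returns paths that include the directory
// prefix, so basename() is used to get bare file names.
$textFiles = array_map('basename', glob($dir . '/*.txt'));

// Everything matched by '*'. The dotfile is not matched by this pattern.
$visible = array_map('basename', glob($dir . '/*'));

print_r($textFiles);
print_r($visible);

// Clean up the scratch directory.
foreach ($names as $name) {
    unlink($dir . '/' . $name);
}
rmdir($dir);
```

If you need the dotfiles as well, readdir() or scandir() will return them.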

Matchu
+3  A: 

Hello!

I think one way would be to read the directory in a loop and store each entry in your array:

$files = array ( );
$dirHandle = opendir('.');
while ( ($currentFile = readdir($dirHandle)) !== false ) // strict check, so a file named "0" does not end the loop
{
      if ( $currentFile == '.' or $currentFile == '..' )
      {
         continue;
      }
      $files[] = $currentFile;
}
closedir($dirHandle);

best regards, lamas

lamas
`if ( $currentFile[0] == '.' )` will also return *true* for hidden files or directories (e.g. `.htaccess`).
Crozin
Ignores `.` and `..` **and** `.svn` **and** `.htaccess`...
Alix Axel
That's bad. Fixing it now
lamas
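
The difference between the two checks discussed in these comments can be sketched without touching the filesystem. The list of names below is made up for illustration.

```php
<?php
// Hypothetical directory listing, as readdir() might return it.
$names = array('.', '..', '.htaccess', '.svn', 'index.php');

// Prefix test: drops every name that starts with a dot,
// including hidden files such as .htaccess.
$byPrefix = array_values(array_filter($names, function ($name) {
    return $name[0] !== '.';
}));

// Exact test: drops only the two special directory entries.
$byExact = array_values(array_filter($names, function ($name) {
    return $name !== '.' && $name !== '..';
}));

print_r($byPrefix);
print_r($byExact);
```

Which one is right depends on whether hidden files should appear in the result.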
A: 
// path
$mydir = '.';

$files = array();
$dir = opendir($mydir);
while(($myfile = readdir($dir)) !== false)
{
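    // is_file() needs the directory prefix, otherwise it checks the current working directory
    if($myfile != '.' && $myfile != '..' && is_file($mydir . '/' . $myfile) )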
    {
        $files[] = $myfile;
    }
}
closedir($dir);

print_r($files);
markb
Does this do anything different than `glob()` at all?
Matchu
@markb: `.` and `..` are directories so you could just do `is_file()`.
Alix Axel
@Alix great thanks
markb
A: 

Either by using glob():

$files = glob('/path/to/dir/*');

Or scandir():

$files = array_diff(scandir('/path/to/dir/'), array('.', '..'));

@Bill Karwin - scandir() without array_diff():

$files = scandir('.');
$result = array();

foreach ($files as $file)
{
    if (($file == '.') || ($file == '..'))
    {
        continue;
    }

    $result[] = $file;
}
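
One detail about the array_diff() variant: array_diff() preserves the keys of its first argument, so the result is usually not zero-indexed. A small sketch, assuming a made-up scratch directory with two files, of how array_values() gives a plain list:

```php
<?php
// Hypothetical scratch directory, created only for this demonstration.
$dir = sys_get_temp_dir() . '/scandir_demo_' . uniqid();
mkdir($dir);
touch($dir . '/one.txt');
touch($dir . '/two.txt');

// Keys from scandir() survive array_diff(), so there are gaps
// where '.' and '..' used to be.
$files = array_diff(scandir($dir), array('.', '..'));

// array_values() reindexes the result from 0.
$reindexed = array_values($files);

print_r(array_keys($files));
print_r($reindexed);

// Clean up the scratch directory.
unlink($dir . '/one.txt');
unlink($dir . '/two.txt');
rmdir($dir);
```

This only matters if the code that consumes the array relies on consecutive numeric keys.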
Alix Axel
+1  A: 
// slightly modified from markb's answer
// poster said "filenames" not files and directories.
// glob has no GLOB_ONLYFILES
// path
$mydir = '.';

$files = array();
$dir = opendir($mydir);
while(($myfile = readdir($dir)) !== false)
{
    if( !is_dir($mydir . '/' . $myfile) ) // prefix with $mydir; is_dir will match . and ..
    {
        $files[] = $myfile;
    }
}
closedir($dir);

print_r($files);

With glob :

// path
$mydir = '.';

$files = array();
foreach(glob($mydir . '/*') as $file_or_dir) { // entries keep the $mydir prefix
    if( !is_dir($file_or_dir) ) // glob('*') never returns . or ..
    {
        $files[] = $file_or_dir;
    }
}
print_r($files);
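
The same files-only filter can probably be written more compactly with array_filter(). This is a sketch under the assumption that a scratch directory with one file and one subdirectory stands in for the real one; all names in it are made up.

```php
<?php
// Hypothetical scratch directory with one file and one subdirectory.
$dir = sys_get_temp_dir() . '/files_only_demo_' . uniqid();
mkdir($dir);
touch($dir . '/file.txt');
mkdir($dir . '/subdir');

// Keep only regular files; array_values() reindexes after filtering.
$filesOnly = array_values(array_filter(glob($dir . '/*'), 'is_file'));

// glob() returns paths with the directory prefix; strip it for bare names.
$fileNames = array_map('basename', $filesOnly);

print_r($fileNames);

// Clean up the scratch directory.
unlink($dir . '/file.txt');
rmdir($dir . '/subdir');
rmdir($dir);
```

Note that is_file() and !is_dir() are not exact opposites for special entries such as sockets, so pick the one that matches what "filenames" means for you.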
Kagee
+7  A: 

The solution using the fewest lines of code isn't always the most efficient if by efficient you mean fastest.

I tested the solutions given by some other answers in this thread in my /usr/lib directory, which contains 394 files. I ran each test 1000 times.

edit: I ran new tests after reading @Alix's comments, and after @lamas's solution changed.

  • Time for @Kagee's solution: foreach(glob()) = 12.4 sec
  • Time for @Matchu's solution: glob() = 8.1 sec
  • Time for @Kagee's solution: foreach(glob(GLOB_NOSORT)) = 6.4 sec
  • Time for @Alix Axel's solution: scandir() = 6.5 sec
  • Time for @Alix Axel's solution: array_diff(scandir()) = 6.4 sec
  • Time for @Kagee's solution: readdir() = 5.3 sec
  • Time for @markb's solution: readdir() = 5.2 sec
  • Time for @Matchu's solution: glob(GLOB_NOSORT) = 2.2 sec
  • Time for @lamas's solution: readdir() = 1.2 sec

Below is the script I used to test, so you can try it yourself:

<?php

$n = 1000;

$start = microtime(true);
$files = array();
for ($i = 0; $i < $n; ++$i) {
    foreach(glob('*') as $file_or_dir) {
        if( !is_dir($file_or_dir) ) // is_dir will match . and ..
        {
            $files[] = $file_or_dir;
        }
    }    
}
$end = microtime(true);
echo "Time for @Kagee's solution: foreach(glob()) = " . ($end-$start) . "\n";

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $files = glob('*');
}
$end = microtime(true);
echo "Time for @Matchu's solution: glob() = " . ($end-$start) . "\n";

$start = microtime(true);
$files = array();
for ($i = 0; $i < $n; ++$i) {
    foreach(glob('*', GLOB_NOSORT) as $file_or_dir) {
        if( !is_dir($file_or_dir) ) // is_dir will match . and ..
        {
            $files[] = $file_or_dir;
        }
    }    
}
$end = microtime(true);
echo "Time for @Kagee's solution: foreach(glob(GLOB_NOSORT)) = " . ($end-$start) . "\n";

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $files = scandir('.');
    $result = array();
    foreach ($files as $file)
    {
        if (($file == '.') || ($file == '..'))
        {
            continue;
        }
        $result[] = $file;
    }
}
$end = microtime(true);
echo "Time for @Alix Axel's solution: scandir() = " . ($end-$start) . "\n";

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $files = array_diff(scandir('.'), array('.', '..'));
}
$end = microtime(true);
echo "Time for @Alix Axel's solution: array_diff(scandir()) = " . ($end-$start) . "\n";

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $files = array();
    $dir = opendir('.');
    while(($myfile = readdir($dir)) !== false)
    {
        if( !is_dir($myfile) )
        {
            $files[] = $myfile;
        }
    }
    closedir($dir);
}
$end = microtime(true);
echo "Time for @Kagee's solution: readdir() = " . ($end-$start) . "\n";

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $files = array();
    $dir = opendir('.');
    while(($myfile = readdir($dir)) !== false)
    {
        if( is_file($myfile) )
        {
            $files[] = $myfile;
        }
    }
    closedir($dir);
}
$end = microtime(true);
echo "Time for @markb's solution: readdir() = " . ($end-$start) . "\n";

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $files = glob('*', GLOB_NOSORT);
}
$end = microtime(true);
echo "Time for @Matchu's solution: glob(GLOB_NOSORT) = " . ($end-$start) . "\n";

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $files = array();
    $dir = opendir('.');
    while(($currentFile = readdir($dir)) !== false)
    {
        if ( $currentFile == '.' or $currentFile == '..' )
        {
            continue;
        }
        $files[] = $currentFile;
    }
    closedir($dir);
}
$end = microtime(true);
echo "Time for @lamas's solution: readdir() = " . ($end-$start) . "\n";
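
If you want to add your own candidates, the repeated start/stop pattern can be factored into a small helper. This is only a sketch: `time_runs()` is a hypothetical name, not part of the script above, and the closure adds a little call overhead to every iteration.

```php
<?php
// Hypothetical helper: runs $callback $n times and returns the
// elapsed wall-clock time in seconds.
function time_runs($callback, $n = 1000)
{
    $start = microtime(true);
    for ($i = 0; $i < $n; ++$i) {
        $callback();
    }
    return microtime(true) - $start;
}

// Example: time glob() with GLOB_NOSORT in the current directory.
$elapsed = time_runs(function () {
    $files = glob('*', GLOB_NOSORT);
}, 100);

printf("glob(GLOB_NOSORT), 100 runs: %.4f sec\n", $elapsed);
```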
Bill Karwin
@Bill Karwin: `glob()` is slower because it has to match patterns **and** sort the files; `scandir()` also sorts, while the others don't. markb's and Kagee's answers also differ substantially from the rest, since they return only files.
Alix Axel
@Bill Karwin: Might I also add that `glob()` accepts the `GLOB_NOSORT` flag, which makes it faster.
Alix Axel
@Bill Karwin: One more thing, @lamas's solution ignores files that start with `.` (like `.htaccess`) so I don't think this benchmark is very fair.
Alix Axel
@Alix: Thanks for the tip about `GLOB_NOSORT`. I've posted my test code. Of course, different methods are useful given different requirements. My point is that the fewest lines of code doesn't necessarily give the best performance.
Bill Karwin
@Bill Karwin: No problem, `GLOB_NOSORT` seems to be significantly faster. Would you mind benchmarking the code I updated in my answer? I would like to know how that compares to @lamas's approach.
Alix Axel
@Alix: Okay, I have added it, but it's not any better.
Bill Karwin
@Bill: That's odd... Thanks! =)
Alix Axel