I am trying to use the PHP recursive directory iterator to find php and html files in order to list those by date. The code used is as follows:

 $filelist = array();
 $iterator = new RecursiveDirectoryIterator( $this->_jrootpath );
 foreach(new RecursiveIteratorIterator($iterator) as $file) {
  if ( !$file->isDir() ) {
   // the dot is escaped so that only real .html/.htm/.php extensions match
   if ( preg_match( "/\.(html|htm|php)$/", $file->getFilename() ) ) {
    $filelist[$file->getPathname()] = $file->getMTime();
   }
  }
 }
 arsort($filelist);
 foreach ($filelist as $key => $val) {
  $resultoutput .= '   <tr>
          <td class="filepath">'.$key.'</td>
          <td class="filedate">'.date("d-m-Y H:i:s",$val).'</td>
         </tr>';
 }

The above works, but I need to restrict the iteration to certain folders. The script starts in a parent folder where I expect particular subfolders such as 'administrator', 'components', 'modules', 'plugins' and 'templates' (in other words, the standard folders of a Joomla installation). I also expect a few files in the parent folder, like 'index.php'. These files and folders should be iterated (frankly, writing this, I am not sure whether the parent folder itself is being iterated by my code!). However, the parent folder may also contain other subfolders that should not be iterated, for example subfolders for subdomains or addon domains named 'mysub' and 'myaddon.com'. There are two reasons why I want to skip every folder except the normal Joomla installation folders. First, I don't need to know about files in other folders. Second, iterating all folders can take so much time and resources that I get 500 error pages.

My question is how I would best change my code in order to make sure that the right folders are excluded from and included in the iteration? I have seen iterator examples around the web checking the name of a directory, but that does not help me enough, because I only need to check the second level folder names (meaning only the highest level subfolder in a path like parent/secondlevel/noneedtocheck/file.php).

A: 

You could try using the RecursiveRegexIterator.

EDIT This is your example using the RecursiveRegexIterator:

$filelist = array();
$iterator = new RecursiveDirectoryIterator( $this->_jrootpath );
// note: looping over the RecursiveRegexIterator directly only visits the top level of $this->_jrootpath
foreach(new RecursiveRegexIterator($iterator, '/\.(html|htm|php)$/') as $file) {
    if ( !$file->isDir() ) {
        $filelist[$file->getPathname()] = $file->getMTime();
    }
}
arsort($filelist);
foreach ($filelist as $key => $val) {
    $resultoutput .= '<tr>
        <td class="filepath">'.$key.'</td>
        <td class="filedate">'.date("d-m-Y H:i:s",$val).'</td>
        </tr>';
}

By the way, it looks like $resultoutput is not created yet. In that case, you should initialize it as an empty string before the loop; otherwise the first .= appends to an undefined variable and PHP raises a notice.

EDIT2: Oops, this does not really answer your question. For some reason I thought it was only about the recursive stuff. This is just a more elegant solution to loop over the files, but it doesn't solve your problem, sorry. You should probably combine this with one of the other answers.
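For the recursive part on its own, a common pattern is to flatten the tree with a RecursiveIteratorIterator first and then put a plain RegexIterator on top of it. The following is only a minimal sketch of that idea: the function name listByExtension() is made up for illustration, it assumes PHP 7 or later for the type declarations, and it does not restrict which folders are visited.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns pathname => modification time for every matching file below $root,
// newest first.
function listByExtension(string $root, string $pattern = '/\.(html|htm|php)$/'): array
{
    // RecursiveIteratorIterator does the descending; SKIP_DOTS drops "." and "..".
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    $filelist = array();
    // RegexIterator tests the pattern against each SplFileInfo cast to a string,
    // which is its pathname, and passes the SplFileInfo object through unchanged.
    foreach (new RegexIterator($files, $pattern) as $file) {
        $filelist[$file->getPathname()] = $file->getMTime();
    }

    arsort($filelist);
    return $filelist;
}
```

Called as listByExtension($this->_jrootpath), this should give the same kind of array as the loop in the question, so the table-building code can stay as it is.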

Franz
That might be the best solution, but I'm really not good with either regex or experimenting with almost undocumented classes. It seems the iterators are powerful tools, but user comments with good examples are still missing in the manual.
E Wierda
You do have a regex in your example code. I guess that would probably work. Else I'd love to help you ;)
Franz
Thanks! That regex was the height of my achievement, lol. I managed to get it to work with the code I posted, but I would like to know how to do it in a better way with a RegexIterator; after all, I write code in good part to learn. I am puzzled how that would pick the right folder level to check, but I suppose a good regex would deal with that.
E Wierda
I see. Could you post what you have with the RecursiveRegexIterator?
Franz
Added example code based on yours now. Was this supposed to work with subdirectories, too?
Franz
A: 
    $filelist = array();

    // make an array of excludedirs (each entry ends with a slash)
    $excludeDirs = array("/var/www/joomla/admin/");

    $iterator = new RecursiveDirectoryIterator( $this->_jrootpath );
    foreach(new RecursiveIteratorIterator($iterator, RecursiveIteratorIterator::CHILD_FIRST) as $file) {
        // skip the excluded directory itself and everything below it
        if (strpos($file->getPathname().'/', $excludeDirs[0]) === 0)
            continue;
        if ( !$file->isDir() ) {
            if ( preg_match( "/\.(html|htm|php)$/", $file->getFilename() ) ) {
                $filelist[$file->getPathname()] = $file->getMTime();
            }
        }
    }
    arsort($filelist);
    foreach ($filelist as $key => $val) {
        $resultoutput .= '<tr>
            <td class="filepath">'.$key.'</td>
            <td class="filedate">'.date("d-m-Y H:i:s",$val).'</td>
            </tr>';
    }
streetparade
I haven't tried this, but if I understand the iteration right, the pathname (the complete path plus file name with extension) would need to be split into parts to find the first directory after _jrootpath, which would then have to match one of the excludeDirs (actually, I suppose those would be includeDirs). Perhaps it would after all be better to use the RecursiveRegexIterator.
E Wierda
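The splitting described in the comment above could look roughly like the sketch below. The function name topLevelSegment() is invented for illustration and the nullable return type assumes PHP 7.1 or later. Note that for a file sitting directly in the root folder, the "first segment" is the file name itself.

```php
<?php
// Hypothetical helper: returns the first path segment below $root,
// or null when $path is not located inside $root.
function topLevelSegment(string $root, string $path): ?string
{
    $root = rtrim($root, '/\\');
    $prefixLen = strlen($root);

    // $path must be longer than $root and start with it
    if (strlen($path) <= $prefixLen || strncmp($path, $root, $prefixLen) !== 0) {
        return null;
    }

    // a separator must follow the root, so "/www/joomla2" is not inside "/www/joomla"
    $sep = $path[$prefixLen];
    if ($sep !== '/' && $sep !== '\\') {
        return null;
    }

    // split the remainder on slashes or backslashes and take the first part
    $parts = preg_split('#[/\\\\]+#', substr($path, $prefixLen), -1, PREG_SPLIT_NO_EMPTY);
    return $parts ? $parts[0] : null;
}
```

Inside the loop, a check such as in_array(topLevelSegment($this->_jrootpath, $file->getPathname()), $includeDirs) would then decide whether to keep a file, with $includeDirs being a list of allowed folder names. This only filters the results; the iterator still walks through the unwanted folders.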
A: 

I am going to try the below: by putting the iterator in a function I can use it repeatedly for known folders to be included in order to build the result array. Still I hope one day I will find a solution that will more elegantly use the PHP iterators without a function nested in my method.

 $filelist = array();
 // store raw timestamps here; they are formatted with date() in the output loop below
 $filelist[$this->_jrootpath.DS.'index.php'] = filemtime($this->_jrootpath.DS.'index.php');
 $filelist[$this->_jrootpath.DS.'index2.php'] = filemtime($this->_jrootpath.DS.'index2.php');
 $joomlafolders = array( 'administrator', 'cache', 'components', 'images', 'includes', 'language', 'libraries', 'logs', 'media', 'modules', 'plugins', 'templates', 'tmp', 'xmlrpc' );
 foreach ( $joomlafolders as $folder ) {
  $iterator = new RecursiveDirectoryIterator( $this->_jrootpath.DS.$folder );
  foreach(new RecursiveIteratorIterator($iterator) as $file) {
   if ( !$file->isDir() ) {
    if ( preg_match( "/\.(html|htm|php)$/", $file->getFilename() ) ) {
     $filelist[$file->getPathname()] = $file->getMTime();
    }
   }
  }
 }
 arsort($filelist);
 foreach ($filelist as $key => $val) {
  $resultoutput .= '   <tr>
          <td class="filepath">'.$key.'</td>
          <td class="filedate">'.date("d-m-Y H:i:s",$val).'</td>
         </tr>';
 }
E Wierda
+1  A: 

No, no! You want something like:

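<?php
class FilteredRecursiveDirectoryIterator extends FilterIterator
{
    public function __construct($path)
    {
        // RecursiveIteratorIterator is what actually descends into subdirectories
        parent::__construct(new RecursiveIteratorIterator(new RecursiveDirectoryIterator($path)));
    }

    public function accept(): bool
    {
        // keep regular files whose extension is in the list
        return $this->current()->isFile()
            && in_array($this->getExtension(), array('html', 'htm', 'php'));
    }

    private function getExtension()
    {
        // pathinfo() avoids passing the result of explode() to end() by reference
        return pathinfo($this->current()->getFilename(), PATHINFO_EXTENSION);
    }
}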

where the accept() method does all the filtering for you. I haven't tested this, but I'm sure it's good to go ;)
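To also keep the iterator out of unwanted first-level folders, which is what the question asks for, the same filtering idea can be applied before recursion happens. Below is a rough sketch using RecursiveCallbackFilterIterator (PHP 5.4 and later; the type declarations assume PHP 7). The function name listJoomlaFiles() and its parameters are hypothetical, and the folder whitelist is just an example value.

```php
<?php
// Hypothetical helper: lists matching files below $root, newest first,
// descending only into the first-level folders named in $allowedDirs.
function listJoomlaFiles(string $root, array $allowedDirs, array $extensions = array('html', 'htm', 'php')): array
{
    $dirs = new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS);

    $filter = new RecursiveCallbackFilterIterator(
        $dirs,
        function (SplFileInfo $current, $key, $iterator) use ($allowedDirs, $extensions) {
            if ($current->isDir()) {
                // $iterator is the directory iterator for the level being filtered;
                // getSubPath() is empty while it is still directly inside $root,
                // so only first-level folders are checked against the whitelist.
                return $iterator->getSubPath() !== ''
                    || in_array($current->getFilename(), $allowedDirs, true);
            }
            // files (at any level, including $root itself) are kept by extension
            return in_array(strtolower($current->getExtension()), $extensions, true);
        }
    );

    $filelist = array();
    // a rejected directory is never entered, so its contents are never read
    foreach (new RecursiveIteratorIterator($filter) as $file) {
        $filelist[$file->getPathname()] = $file->getMTime();
    }

    arsort($filelist);
    return $filelist;
}
```

A call such as listJoomlaFiles($this->_jrootpath, array('administrator', 'components', 'modules', 'plugins', 'templates')) would pick up index.php in the parent folder and everything below the listed folders, while folders like 'mysub' or 'myaddon.com' are skipped without being traversed.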

CpILL