As our PHP5 OO application grew (in both size and traffic), we decided to revisit the __autoload() strategy.

We always name the file after the class it defines, so class Customer lives in Customer.php. Our autoloader used to walk a list of directories in which the file could potentially exist until the right .php file was found.

This is quite inefficient: you potentially go through a number of directories unnecessarily, and you do so on every request, which means loads of stat() calls.

Solutions that come to my mind...

-use a naming convention that dictates the directory name (similar to PEAR). Disadvantage: it doesn't scale well and results in horrible class names.

-come up with some kind of pre-built array of the locations (Propel does this for its __autoload). Disadvantage: requires a rebuild before every deploy of new code.

-build the array "on the fly" and cache it. This seems to be the best solution, as it allows for any class names and directory structure you want, and is fully flexible in that new files just get added to the list. The concerns are where to store it and what to do about deleted/moved files. For storage we chose APC, as it doesn't have the disk I/O overhead. Deleted files don't matter, as you probably don't wanna require them anywhere anyway. As to moves... that's unresolved (we ignore it, as historically it hasn't happened very often for us).
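
A minimal sketch of the third option, assuming a plain array stands in for the APC store (in production the lookups would go through something like apc_fetch() and apc_store()). The function name resolveClassFile() and its parameters are hypothetical, made up for illustration:

```php
<?php
// Sketch only: $cache stands in for a shared store such as APC.
// resolveClassFile() and its parameters are hypothetical names.
function resolveClassFile($class, array $dirs, array &$cache)
{
    // Cache hit: no directory walk, no extra stat() calls.
    if (isset($cache[$class])) {
        return $cache[$class];
    }

    // Cache miss: fall back to the directory walk, but only this once.
    foreach ($dirs as $dir) {
        $file = rtrim($dir, '/') . '/' . $class . '.php';
        if (is_file($file)) {
            $cache[$class] = $file; // remember the location for later lookups
            return $file;
        }
    }

    return null; // not found in any of the directories
}
```

Note that a cached entry is trusted blindly here, which is exactly why moved files remain a problem.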

Any other solutions?

A: 

CodeIgniter does something similar with the load_class function. If I recall correctly, it is a static function that holds an array of objects. The call to the function is:


 load_class($class_name, $instantiate);

so in your case


 load_class('Customer', TRUE);

and this would load an instance of the Customer class into the objects array.

The function was pretty straightforward. Sorry, I can't remember the name of the class it was in, but I do recall several classes being loaded this way, such as (I think) the Routing class, the Benchmark class and the URI class.

A: 

Have you investigated using Zend_Loader (with registerAutoload()) instead of just the native __autoload()? I've used Zend_Loader and have been happy with it, but haven't looked at it from a performance perspective. However, this blog post seems to have done some performance analysis on it; you could compare those results to your in-house performance testing on your current autoloader to see if it can meet your expectations.

Rob Hruska
+2  A: 

I've also been playing with autoload for quite some time, and I ended up implementing a sort of namespaced autoloader (yes, it also works on PHP 5.2).

The strategy is quite simple: First I have a singleton class (loader) which has a call that simulates import. This call takes one parameter (the full class name to load) and internally calculates the file name it was called from (using debug_backtrace()). The call stores this information in an associative array to use it later (using the calling file as the key, and a list of imported classes for each key).

Typical code looks like this:

<?php

    loader::import('foo::bar::SomeClass');
    loader::import('foo::bar::OtherClass');

    $sc = new SomeClass();

?>

When autoload is fired, the full class name that was stored in the array gets transformed to a real filesystem location (double colons are replaced with directory separators), and the resulting filename is included.
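
The name-to-path step can be sketched roughly as follows. This is not the code from the answer: the class and method names are hypothetical, and it is simplified to keep one global import table instead of keying the imports by calling file via debug_backtrace().

```php
<?php
// Hypothetical sketch, not the answer author's actual loader.
class Loader
{
    /** @var array short class name => full "a::b::Class" name */
    private static $imports = array();

    // Record an import; nothing is loaded from disk yet.
    public static function import($fullName)
    {
        $parts = explode('::', $fullName);
        self::$imports[end($parts)] = $fullName;
    }

    // Map a short class name to a file path, or null if it was never imported.
    public static function pathFor($class, $baseDir)
    {
        if (!isset(self::$imports[$class])) {
            return null;
        }
        // Double colons become directory separators.
        $relative = str_replace('::', DIRECTORY_SEPARATOR, self::$imports[$class]);
        return $baseDir . DIRECTORY_SEPARATOR . $relative . '.php';
    }
}
```

An autoload function would then call Loader::pathFor() with the requested class name and require the returned file if it is not null.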

I know that it's not exactly what you were asking for, but it may solve the directory traversal issue, as the loader knows exactly where the file is (with the added benefit that you can organize your classes in directories, with no apparent performance penalty).

I could provide you some working examples, but I'm too shy to show my crappy code to the public. Hope the above explanation was useful...

azkotoki
Two big advantages of autoload are 1) not having to manually load classes before using them and 2) not hard coding their locations in the file system. You've undone both.
Preston
You may be right if your project uses only a few classes, but for large OO projects you'll have to move to another solution. Imagine PHP searching for a single class through all the directories of a huge framework. That's too slow without some sort of import hints.
azkotoki
A: 

If you're using APC, you should enable opcode caching. I believe this would have more benefit to performance, and be a more transparent solution, than any class/file strategy you employ.

The second suggestion is to use whatever class/file strategy is easiest to develop against, but on your production site, you should concatenate the classes used most frequently into one file and make sure that's loaded during every request (or cached with APC).

Do some performance experiments with increasing sets of your classes. You will probably find that the benefit of reducing file I/O is so significant, that stuffing all the classes together into one file is a net win, even if you don't need all the classes during every request.

Bill Karwin
Well, of course I use opcode caching. It's just that APC doesn't stop you from calling stat() multiple times, as it intervenes only upon require().
tpk
As to the second point, given the number of classes, the benefit is cancelled out by the fact that we'd be loading way more than necessary. Also, it's really inconvenient to rebuild it every time you want to deploy something.
tpk
Have you measured the performance and compared the results, or are you just making an educated guess?
Bill Karwin
+1  A: 

There are two general approaches that work well.
The first is to use the PEAR standard class naming structure, so you just need to replace '_' with '/' to find the class file.

http://pear.php.net/manual/en/pear2cs.rules.php

Or you can search directories for PHP classes and map each class name to its file. You can save the class map to a cache to avoid searching the directories on every page load.
The Symfony framework uses this approach.

Generally it's better to follow the standard structure, as it's simpler and you don't need to cache anything, plus you are then following recommended guidelines.
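
For illustration, the first approach could look something like this. The function names are hypothetical, and the sketch assumes the class files sit somewhere on the include_path and that PHP >= 5.3.2 is available for stream_resolve_include_path():

```php
<?php
// pearClassToPath() and pearAutoload() are hypothetical helper names.

// Foo_Bar_Baz -> Foo/Bar/Baz.php
function pearClassToPath($class)
{
    return str_replace('_', '/', $class) . '.php';
}

// Resolve the relative path against the include_path and load it if present.
function pearAutoload($class)
{
    $file = stream_resolve_include_path(pearClassToPath($class));
    if ($file !== false) {
        require $file;
    }
}

// Registers next to any other autoloaders instead of replacing them.
spl_autoload_register('pearAutoload');
```

No directory search or cache is involved: the class name alone determines the relative path.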

MOdMac
A: 
function __autoload( $class )
{
    $patterns = array( '%s.class.php', '%s.interface.php' );

    // PATH_SEPARATOR is ':' on POSIX systems and ';' on Windows
    foreach( explode( PATH_SEPARATOR, ini_get( 'include_path' ) ) as $dir )
    {
        foreach( $patterns as $pattern )
        {
            $file    = sprintf( $pattern, $class );
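            // escape both arguments so unusual paths or class names cannot break the shell command
            $command = sprintf( 'find -L %s -name %s -print', escapeshellarg( $dir ), escapeshellarg( $file ) );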
            $output  = array();
            $result  = -1;

            exec( $command, $output, $result );

            if ( count( $output ) == 1 )
            {
                require_once( $output[ 0 ] );
                return;
            }
        }
    }

    if ( is_integer( strpos( $class, 'Exception' ) ) )
    {
        eval( sprintf( 'class %s extends Exception {}', $class ) );
        return;
    }

    if ( ! class_exists( $class, false ) )
    {
        // no exceptions in autoload :(
        die( sprintf( 'Failure to autoload class: "%s"', $class ) );
        // or perhaps: die ( '<pre>'.var_export( debug_backtrace(), true ).'</pre>' );
    }
}

You could also find some other, less POSIX-dependent way to iterate directories, but this is what I've been using.

It traverses all the directories in the include_path (set in php.ini or .htaccess) to find a class or interface.

Kris
OMG, that's horrible. You call `find` for every dir in the INCLUDE_PATH, for every class you load, on every PHP request?!?
Bill Karwin
And your path separator is ';' (which means this code runs only on Windows) but you rely on a posix tool like `find`?
Bill Karwin
I concur... this is horribly inefficient. Why not file_exists()? That's what we have at the moment.
tpk
Check out RecursiveDirectoryIterator.
Preston
Sorry Bill, you are wrong on both counts. I do not run on Windows, and I do cache the location of files in a database; this only gets used when they cannot be found there. Remember it is just a sample, smart people...
Kris
Preston: I have. In combination with RecursiveIteratorIterator it gets the job done in less code and almost as fast (ymmv).
Kris
tpk: file_exists() only takes the path to a single file, so you cannot use it to search more deeply nested directories.
Kris
I sort of like the way you dynamically create exceptions. Then again it's kinda eval..erm evil. Sorta bittersweet
André Hoffmann
In addition to failings the others have pointed out, 1) Exceptions need to be defined somewhere so that others can look them up and see the comment which explains under what circumstances they will be thrown. 2) Die()ing when the class doesn't exist provides *less* useful information than just letting the calling code crash naturally.
too much php
@too much php, if you read before you post you'd see that exceptions are loaded when a definition exists, and only generated if there is none. That way you can throw a WhateverTheHeckYouNeedException without implementing a definition; most exceptions you throw are only going to extend Exception by name anyway.
Kris
+1  A: 

We use something much like the last option, except with a file_exists() check before the require. If it doesn't exist, rebuild the cache and try once more. You get the extra stat per file, but it handles moves transparently. Very handy for rapid development where I move or rename things frequently.
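
Roughly, the idea looks like the sketch below. The function names are made up for illustration, the map is passed around as a plain array rather than read from a real cache, and it assumes one class per file with the file named after the class:

```php
<?php
// Hypothetical sketch: buildClassMap() and loadClass() are illustrative names.

// Scan $root recursively and map "Customer" => "/path/to/Customer.php".
function buildClassMap($root)
{
    $map = array();
    $files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($root));
    foreach ($files as $file) {
        if ($file->isFile() && substr($file->getFilename(), -4) === '.php') {
            $map[$file->getBasename('.php')] = $file->getPathname();
        }
    }
    return $map;
}

// Load $class from the cached map; rebuild once if the entry is missing or stale.
function loadClass($class, $root, array &$map)
{
    for ($attempt = 0; $attempt < 2; $attempt++) {
        // The extra stat: make sure the cached location still exists.
        if (isset($map[$class]) && file_exists($map[$class])) {
            require_once $map[$class];
            return true;
        }
        if ($attempt === 0) {
            $map = buildClassMap($root); // file moved or new: rescan, then retry
        }
    }
    return false; // still not found after one rebuild
}
```

The rebuild only happens on a miss or a stale entry, so the common case costs one file_exists() call per class.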

Preston
+1  A: 

I have used this solution in the past. I blogged about it for reference, and it may be interesting to some of you...

Here it is

Orange Box
A: 

I have specific naming conventions for each 'type' of class (controllers, models, library files, and so on...), so currently I do something similar to:

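function __autoload($class){
    // Hypothetical conventions, for illustration only: class name suffix => directory
    $conventions = array(
        'Controller' => 'controllers/',
        'Model'      => 'models/',
    );
    foreach($conventions as $suffix => $dir){
        // does the class name end with this suffix, and does the matching file exist?
        if(substr($class, -strlen($suffix)) === $suffix && file_exists($dir.$class.'.php')){
            require $dir.$class.'.php';
            return;
        }
    }
    if(file_exists('library/'.$class.'.php')){
        // fallback: anything else is treated as a library file
        require 'library/'.$class.'.php';
    } else {
        //throw an error because that class does not exist?
    }
}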
SeanJA
A: 

Old thread, but I thought I could share my method here anyway; maybe it could help someone. This is how I define __autoload() in my website entry point, /path/to/root/www/index.php for example:

function __autoload($call) {
    require('../php/'.implode('/', explode('___', $call)).'.php');
}

All PHP files are organized in a tree:

/path/to/root/php
  /Applications
    /Website
      Server.php
  /Model
    User.php
  /Libraries
    /HTTP
      Client.php
    Socket.php

And the class names are:

Applications___Website___Server
Model___User
Libraries___HTTP___Client
Libraries___Socket

It is fast and if the file is not present, then it will crash and your error log will tell you which file is missing. It may seem a bit harsh but if you try to use the wrong class, it is your problem.

NB: this was written for PHP 5 < 5.3; on PHP 5.3 you can use namespaces instead. I used three underscores as the separator because they are easy to search-and-replace when moving to 5.3 namespaces.
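
A possible namespace equivalent, assuming PHP >= 5.3 and the same directory tree (the function name and the $baseDir parameter are hypothetical):

```php
<?php
// Hypothetical helper; assumes PHP >= 5.3 and one class per file.
function namespacedClassToPath($class, $baseDir)
{
    // Applications\Website\Server -> <baseDir>/Applications/Website/Server.php
    $relative = str_replace('\\', '/', ltrim($class, '\\'));
    return rtrim($baseDir, '/') . '/' . $relative . '.php';
}
```

The autoload function would then require namespacedClassToPath($call, '../php') instead of exploding on the triple underscore.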

Serty Oan