Everything I read about better PHP coding practices keeps saying don't use require_once because of speed.
Why is this?
What is the proper/better way to do the same thing as require_once?
(If it matters, I'm using PHP 5.)
A better way to do things is to use an object-oriented approach and use __autoload().
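For example, a minimal sketch of __autoload() (the classes/ directory layout here is just an assumption for illustration):
<?php
// PHP calls __autoload() automatically the first time an undefined class
// is referenced, so each class file is loaded at most once without any
// per-include bookkeeping in your code.
function __autoload($className)
{
    require '/path/to/classes/' . $className . '.php';
}

$obj = new MyClass(); // triggers __autoload('MyClass') on first use
?>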
The require_once and include_once statements require that the system keep a log of what has already been included/required, and each further *_once statement has to check that log.
On a computational basis, I can see there's extra work and resources going into doing that, but enough to hurt the speed of the whole app? I really doubt it... not unless you're on really old hardware.
If you're really concerned about it, the alternative is doing the work yourself in as lightweight a manner as you can get away with. For simple apps, just making sure you've only included a file once should suffice, but if you're still getting redefine errors, you could do something like this:
if (!defined('MyIncludeName')) {
    // First pass: pull the file in and set a marker constant so any
    // later attempt skips the require.
    require('MyIncludeName');
    define('MyIncludeName', 1);
}
It's not great and it'll junk up your code, but it's light. Personally, I'll stick with the ..._once statements.
Can you give us any links to these coding practices which say to avoid it? As far as I'm concerned, it's a complete non-issue. I haven't looked at the source code myself, but I'd imagine that the only difference between include and include_once is that include_once adds that filename to an array and checks over the array each time. It'd be easy to keep that array sorted, so searching over it should be O(log n), and even a medium-largish application would only have a couple of dozen includes.
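To make that concrete, here is a hypothetical userland equivalent of that bookkeeping (with a hash-keyed array the lookup is effectively O(1) rather than O(log n)):
<?php
// Remember which files have already been loaded and skip repeats --
// roughly what include_once is presumed to do internally.
function include_once_manual($file)
{
    static $loaded = array();
    $path = realpath($file); // normalize so 'a.php' and './a.php' match
    if (isset($loaded[$path])) {
        return true; // already included; nothing to do
    }
    $loaded[$path] = true;
    return include $path;
}
?>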
I think the PEAR documentation has a recommendation for when to use require, require_once, include, and include_once. I do follow that guideline; it makes your application clearer.
Test it yourself: try include, Oli's alternative, and __autoload(), with something like APC installed. I doubt that using a constant will speed things up.
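A rough timing harness for that kind of comparison might look like this (the file name and iteration count are illustrative):
<?php
// Time repeated require_once calls on the same file: after the first
// pass, all we're measuring is the duplicate-inclusion check itself.
$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    require_once '/path/to/hdr.php';
}
printf("require_once x1000: %.4f sec\n", microtime(true) - $start);
?>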
My personal opinion is that the usage of require_once (or include_once) is bad practice, because require_once checks for you whether you have already included that file, and suppresses the fatal errors (duplicate declaration of functions/classes/etc.) that double inclusion would otherwise cause.
You should know if you need to include a file.
The *_once() functions stat every parent directory to ensure the file you're including isn't the same as one that's already been included. That's part of the reason for the slowdown.
I recommend using a tool like Siege for benchmarking. You can try all the suggested methodologies and compare response times.
More on require_once() at Tech Your Universe.
Yes, it is slightly more expensive than plain ol' require(). I think the point is that if you can keep your code organized enough to avoid duplicate includes, skipping the *_once() functions will save you some cycles.
But using the _once() functions isn't going to kill your app. Basically, just don't use it as an excuse to not have to organize your includes. In some cases, using it is still unavoidable, and it's not a big deal.
The PEAR2 wiki lists good reasons for abandoning all the require/include directives in favor of autoloading, at least for library code. These tie you down to rigid directory structures when alternative packaging models like phar are on the horizon.
I got curious and checked out Adam Backstrom's link to Tech Your Universe. This article describes one of the reasons that require should be used instead of require_once. However, their claims didn't hold up to my analysis. I'd be interested in seeing where I may have misanalysed the solution. I used PHP 5.2.0 for comparisons.
I started out by creating 100 header files that used require_once to include another header file. Each of these files looked something like:
<?php
// /home/fbarnes/phpperf/hdr0.php
require_once "../phpperf/common_hdr.php";
?>
I created these using a quick bash hack:
for i in /home/fbarnes/phpperf/hdr{0..99}.php; do
  echo "<?php
// $i" > $i
  cat helper.php >> $i
done
This way I could easily swap between using require_once and require when including the header files. I then created an app.php to load the one hundred files. This looked like:
<?php
// Load all of the php hdrs that were created previously
for($i = 0; $i < 100; $i++)
{
    require_once "/home/fbarnes/phpperf/hdr$i.php";
}

// Read the /proc file system to get some simple stats
$pid = getmypid();
$fp = fopen("/proc/$pid/stat", "r");
$line = fread($fp, 2048);
$array = split(" ", $line);

// Write out the statistics; on RedHat 4.5 w/ kernel 2.6.9
// 14 is user jiffies; 15 is system jiffies
$cntr = 0;
foreach($array as $elem)
{
    $cntr++;
    echo "stat[$cntr]: $elem\n";
}
fclose($fp);
?>
I contrasted the require_once headers with require headers that used a header file looking like:
<?php
// /home/fbarnes/phpperf/h/hdr0.php
if(!defined('CommonHdr'))
{
    require "../phpperf/common_hdr.php";
    define('CommonHdr', 1);
}
?>
I didn't find much difference when running this with require vs. require_once. In fact, my initial tests seemed to imply that require_once was slightly faster, but I don't necessarily believe that.
I repeated the experiment with 10,000 input files, and here I did see a consistent difference. I ran the test multiple times; the results are close, but using require_once takes on average 30.8 user jiffies and 72.6 system jiffies, while using require takes on average 39.4 user jiffies and 72.0 system jiffies. So the load appears to be slightly lower with require_once, but the wall-clock time is slightly higher: the 10,000 require_once calls complete in 10.15 seconds on average, while the 10,000 require calls take 9.84 seconds on average.
The next step was to look into these differences. I used strace to analyse the system calls being made.
Before opening a file from require_once the following system calls are made:
time(NULL) = 1223772434
lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/fbarnes", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/fbarnes/phpperf", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/fbarnes/phpperf/h", {st_mode=S_IFDIR|0755, st_size=270336, ...}) = 0
lstat64("/home/fbarnes/phpperf/h/hdr0.php", {st_mode=S_IFREG|0644, st_size=88, ...}) = 0
time(NULL) = 1223772434
open("/home/fbarnes/phpperf/h/hdr0.php", O_RDONLY) = 3
This contrasts with require:
time(NULL) = 1223772905
lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/fbarnes", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/fbarnes/phpperf", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/fbarnes/phpperf/h", {st_mode=S_IFDIR|0755, st_size=270336, ...}) = 0
lstat64("/home/fbarnes/phpperf/h/hdr0.php", {st_mode=S_IFREG|0644, st_size=146, ...}) = 0
time(NULL) = 1223772905
open("/home/fbarnes/phpperf/h/hdr0.php", O_RDONLY) = 3
Tech Your Universe implies that require_once should make more lstat64 calls. However, they both make the same number of lstat64 calls. Possibly, the difference is that I am not running APC to optimize the code above. The next thing I did was compare the output of strace for the entire runs:
[fbarnes@myhost phpperf]$ wc -l strace_1000r.out strace_1000ro.out
190709 strace_1000r.out
210707 strace_1000ro.out
401416 total
Effectively there are approximately two more system calls per header file when using require_once. One difference is that require_once has an additional call to the time() function:
[fbarnes@myhost phpperf]$ grep -c time strace_1000r.out strace_1000ro.out
strace_1000r.out:20009
strace_1000ro.out:30008
The other system call is getcwd():
[fbarnes@myhost phpperf]$ grep -c getcwd strace_1000r.out strace_1000ro.out
strace_1000r.out:5
strace_1000ro.out:10004
This is called because I used a relative path reference in the hdrXXX files. If I make this an absolute reference, then the only difference is the additional time(NULL) call made in the code:
[fbarnes@myhost phpperf]$ wc -l strace_1000r.out strace_1000ro.out
190705 strace_1000r.out
200705 strace_1000ro.out
391410 total
[fbarnes@myhost phpperf]$ grep -c time strace_1000r.out strace_1000ro.out
strace_1000r.out:20008
strace_1000ro.out:30008
This seems to imply that you could reduce the number of system calls by using absolute paths rather than relative paths. The only difference outside of that is the time(NULL) calls, which appear to be used for instrumenting the code to compare what is faster.
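For example, the relative require in the headers could be anchored to an absolute path (dirname(__FILE__) is the PHP 5 idiom; __DIR__ only arrived in 5.3):
<?php
// Sketch: an absolute path avoids the extra getcwd() calls seen in the
// strace output above.
require_once dirname(__FILE__) . '/common_hdr.php';
?>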
One other note: the APC optimization package has an option called "apc.include_once_override" which claims to reduce the number of system calls made by require_once and include_once (see the PHP docs).
Sorry for the long post. I got curious.
This thread makes me cringe, because there's already been a "solution posted", and it's, for all intents and purposes, wrong. Let's enumerate:
1. Defines are really expensive in PHP. You can look it up or test it yourself, but the only efficient way of defining a global constant in PHP is via an extension. (Class constants are actually pretty decent performance-wise, but this is a moot point, because of 2.)
2. If you are using require_once() appropriately, that is, for inclusion of classes, you don't even need a define; just check if class_exists('Classname') (a sketch of this guard follows below). If the file you are including contains code, i.e. you're using it in the procedural fashion, there is absolutely no reason that require_once() should be necessary for you; each time you include the file you presume to be making a subroutine call.
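A minimal sketch of that class_exists() guard (the class name and file path are illustrative):
<?php
// Only pull the file in if the class it defines isn't loaded yet.
// Passing false as the second argument stops class_exists() from
// triggering __autoload() while we check.
if (!class_exists('MyClass', false)) {
    require '/path/to/MyClass.php';
}
?>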
So for a while, a lot of people did use the class_exists() method for their inclusions. I don't like it because it's fugly, but they had good reason to: require_once() was pretty inefficient before some of the more recent versions of PHP. But that's been fixed, and it is my contention that the extra bytecode you'd have to compile for the conditional, and the extra method call, would by far outweigh any internal hashtable check.
Now, an admission: this stuff is tough to test for, because it accounts for so little of the execution time.
Here is the question you should be thinking about: includes, as a general rule, are expensive in PHP, because every time the interpreter hits one it has to switch back into parse mode, generate the opcodes, and then jump back. If you have 100+ includes, this will definitely have a performance impact. The reason why using or not using require_once is such an important question is that it makes life difficult for opcode caches. An explanation for this can be found here, but what this boils down to is that:
If during parse time you know exactly which include files you will need for the entire life of the request, require() those at the very beginning, and the opcode cache will handle everything else for you.
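In bootstrap form, that advice looks something like this (the file names are illustrative):
<?php
// bootstrap.php -- front-load every include with plain require so an
// opcode cache can serve the compiled result on subsequent requests.
require '/app/lib/db.php';
require '/app/lib/session.php';
require '/app/lib/templates.php';
?>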
If you are not running an opcode cache, you're in a hard place. Inlining all of your includes into one file (don't do this during development, only in production) can certainly help parse time, but it's a pain to do, and also, you need to know exactly what you'll be including during the request.
Autoload is very convenient, but slow, for the reason that the autoload logic has to be run every time an include is done. In practice, I've found that autoloading several specialized files for one request does not cause too much of a problem, but you should not be autoloading all of the files you will need.
If you have maybe 10 includes (this is a very back of the envelope calculation), all this wanking is not worth it: just optimize your database queries or something.
It has nothing to do with speed. It's about failing gracefully.
If require_once() fails, your script is done; nothing else is processed. If you use include_once(), the rest of your script will try to continue to render, so your users would potentially be none the wiser about something that failed in your script.
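A quick illustration of the difference (the file name is hypothetical):
<?php
// include_once only raises a warning for a missing file; execution continues.
include_once 'missing_widget.php'; // E_WARNING; the script keeps running
echo "Still rendering for the user.\n";

// require_once raises a fatal error for a missing file; nothing below runs.
require_once 'missing_widget.php';
echo "Never reached.\n";
?>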