A client is insisting that we store some vital and complex configuration data as PHP arrays, while I want it stored in the database. He brought up the issue of efficiency/optimization, saying that file I/O will be much faster than database queries. I'm pretty sure I heard somewhere that file includes are actually slow in PHP.

Any stats/real info on this?

+3  A: 

Given that most people will include 10-20 files into their script for a regular page, I have a feeling that includes are much faster than MySQL queries.

I could be wrong, though.

The real question is whether the values ever change. If they will never change without you making other modifications (moving files, etc.), they should probably be stored in an include file.

If the data is dynamic in any way, it should be pulled from a database.
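To make the two options concrete, here is a minimal sketch of both. The keys, values, table name `config`, column names `name` and `value`, and the function `loadConfigFromDb` are all hypothetical, chosen only for illustration. The file is written to a temp directory so the snippet runs on its own; in practice it would simply live in your project.

```php
<?php
// Static variant: a PHP file that returns an array (hypothetical keys and values).
$path = sys_get_temp_dir() . '/example_config_' . uniqid() . '.php';
file_put_contents($path, "<?php\nreturn [\n    'site_name' => 'Example',\n    'items_per_page' => 25,\n];\n");

// include evaluates to whatever the included file returns.
$config = include $path;
unlink($path);

// Dynamic variant: build the same array shape from a table.
// Assumes a hypothetical table `config` with columns `name` and `value`.
function loadConfigFromDb(PDO $pdo): array
{
    $stmt = $pdo->query('SELECT name, value FROM config');
    // FETCH_KEY_PAIR maps the first column to keys and the second to values.
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}

echo $config['site_name'], "\n";
```

Either way the rest of the application sees a plain array, so switching storage later only touches the loading code.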

Chacha102
+6  A: 

It's gonna vary heavily based on your specific case.

If the database is stored in memory and/or the data you're looking for is cached, then database I/O should be pretty fast. A really complex query on a large database can take a fair bit of time if it's not cached or it has to go to disk, though.

File I/O does have to read from the disk, which is slow, though there are also smart caching mechanisms for keeping often-accessed files in memory as well.

Profiling on your actual system is gonna be the most definitive.
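A rough sketch of what such profiling could look like follows. The helper name `averageSeconds` and the sample config contents are made up for illustration, and the database side is only shown as a comment because it depends on your own connection and schema.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $fn.
function averageSeconds(callable $fn, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $iterations;
}

// Throwaway config file that returns an array (stand-in for the real config).
$configFile = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents(
    $configFile,
    '<?php return ' . var_export(['debug' => false, 'items_per_page' => 25], true) . ';'
);

// Time loading the config through include.
$includeTime = averageSeconds(function () use ($configFile) {
    return include $configFile;
});

printf("include: %.6f s per call\n", $includeTime);

// To compare, pass a closure that runs your real query, for example:
// averageSeconds(function () use ($pdo) {
//     return $pdo->query('SELECT * FROM config')->fetchAll();
// });

unlink($configFile);
```

Run it on the actual server, with the same opcode cache and database setup as production, since those are exactly the variables that decide the outcome.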

Gabriel Hurley
The data which will be pulled would be 30-50 rows and not a complex query, just a select * from. Any thoughts in that case?
Click Upvote
+1 for mentioning the "based on your specific case"
MitMaro
And +1 for the profiling suggestion. There are lots of variables--is the DB in memory? Do you use a bytecode cache like APC? Is the DB local?
Otterfan
+3  A: 

I don't think that performance is a compelling argument either way. On my Mac, I ran the following tests.

First, 10,000 includes of a file that does nothing but set a variable:

<?php

// Record the start time as a float (seconds since the Unix epoch).
$starttime = microtime(true);

// Include the same file 10,000 times.
for ($i = 0; $i < 10000; $i++) {
    include("foo.php");
}

// Record the end time and report the elapsed seconds.
$endtime = microtime(true);
$totaltime = ($endtime - $starttime);
echo 'Rendered in ' . $totaltime . ' seconds.';
?>

It took about 0.58 seconds to run each time. (Remember, that's 10,000 includes.)

Then I wrote another script that queries the database 10,000 times. It doesn't select any real data, just does a SELECT NOW().

<?php
// Connect before starting the timer, so connection overhead is not measured.
// Note: the mysql_* functions were removed in PHP 7; mysqli or PDO replace them.
mysql_connect('127.0.0.1', 'root', '');
mysql_select_db('test');

// Record the start time as a float (seconds since the Unix epoch).
$starttime = microtime(true);

// Run the same trivial query 10,000 times.
for ($i = 0; $i < 10000; $i++) {
    mysql_query("select now()");
}

// Record the end time and report the elapsed seconds.
$endtime = microtime(true);
$totaltime = ($endtime - $starttime);
echo 'Rendered in ' . $totaltime . ' seconds.';

?>

This script takes roughly 0.76 seconds to run on my computer each time. Obviously there are a lot of factors that could make a difference in your specific case, but in this test there was no meaningful performance difference between running MySQL queries and using includes. (Note that I did not include the MySQL connection overhead in my test -- if you're connecting to the database only to get this data, that would make a difference.)

Rafe
For a relevant benchmark, you should have only one include statement in the script and then run the script 10000 times, rather than having 10000 includes in a single PHP script and running the script once. For one thing, repeated includes do not lend themselves to opcode caching, because each inclusion could theoretically change how the next include should be parsed. In my tests, the include benchmark actually runs 7 times faster when APC is *disabled*. :-)
Søren Løvborg
+4  A: 

This is a pretty obvious case of premature optimization. Don't ever try to optimize things like this unless you've actually identified it as a real bottleneck in a production environment.

That said, using an opcode cache like APC (you are using an opcode cache, right? Because that is the very first thing you should do to optimize PHP), my money is on the include file.

But again, the difference will likely be negligible, so pick the solution which requires 1) the least code and 2) the least maintenance. Programmer time is much more expensive than CPU time.

Update: I did a quick benchmark of the inclusion of a PHP file defining a 1000-entry array. The script ran 5 times faster using APC than without.

A similar benchmark, fetching 1000 rows from a MySQL database (on localhost), only ran 15% faster using APC (since APC doesn't do anything for database queries).

However, once APC was enabled, there was no significant difference between using an include file and using a database.

Søren Løvborg
+1  A: 
grantwparks