I've just noticed that my app is including 148 PHP files on one page. Bear in mind this is the back-end admin and not the main site, but is this too many? What impact does a large number of includes have on a server, both under average load and when stressed? Would disk I/O be a problem?

Included File Stats

File Type - Include Count - Combined File Size

  • Index - 1 - 0.00169 MB
  • Bootstrap - 1 - 0.01757 MB
  • Helper - 98 - 0.58557 MB - (11 are Profiler related classes)
  • Configuration - 8 - 0.00672 MB
  • Data Store - 23 - 0.10836 MB
  • Action - 8 - 0.02652 MB
  • Page - 1 - 0.00094 MB
  • I18n Resource - 7 - 0.00870 MB
  • Vendor Library - 1 - 0.02754 MB
  • Total Files - 148 - 0.78362 MB

Time ran 0.123920917511

Memory used 2.891 MB
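
For anyone wanting to gather comparable figures, here is a minimal sketch using PHP's built-in `get_included_files()`, `filesize()` and `memory_get_peak_usage()`. The function name `included_file_stats` is made up for illustration, and this is not necessarily how the numbers above were produced. Call it at the very end of the request so that every include has already happened.

```php
<?php
// Hypothetical helper: summarise the files included so far in this request.
function included_file_stats(): array
{
    $files = get_included_files();
    $bytes = 0;
    foreach ($files as $file) {
        // filesize() returns false for entries that are not real files;
        // count those as zero bytes.
        $size = @filesize($file);
        $bytes += ($size === false) ? 0 : $size;
    }

    return [
        'count'          => count($files),
        'megabytes'      => round($bytes / 1048576, 5),
        'peak_memory_mb' => round(memory_get_peak_usage() / 1048576, 3),
    ];
}

print_r(included_file_stats());
```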


Edit 1. It should be noted that this is a worst-case page. It has many different template models, controllers and associated views because it handles publishing with custom fields.

Edit 2. Also, the frontend has aggressive page caching, so the number of includes on the front end is roughly 30-40 at the moment.

Edit 3. When the profiler is turned off, its files are not included, so that will remove quite a few includes.

+2  A: 

Yes, so many files can be a problem.

No, it is probably not a problem in your case, since this is only a back-end, which is probably accessed by a few people, and not too often.

In general, I would discourage having more than 20 PHP files called on each page. This is because even if the website and the server are highly optimized, for every page the server must look at every file, at least to see whether it has changed since the last request (if there is no cache implemented at this level).

Even if the time to access a file is tiny, it is time you are losing on each request. This tiny period of time multiplied by 148 can become an issue (and a huge scalability problem).

When I worked on a PHP framework project, I used a trick to reduce the number of files. Several files were combined to one minified file, and this single file was cached. Then, if there was a need to update the framework or the website, the cached file was automatically removed, then rebuilt.

Although I personally discourage you from minifying the source code (because it is difficult to do, difficult to test, and creates a bunch of problems, such as meaningless line numbers in error messages), you can probably get the same benefit by simply combining all your files into a single file.

Be careful: if page A uses half of those files and page B uses the other half, combining everything will probably decrease performance, since the PHP engine will have to parse more code.
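
The combine-and-cache idea could be sketched roughly as follows. This is an illustration only, not the framework's actual code: `build_bundle` is a hypothetical name, and the sketch assumes every source file is plain PHP that starts with `<?php`, has no `declare()` statement, and does not depend on `__FILE__` or `__DIR__`.

```php
<?php
// Hypothetical helper: concatenate PHP source files into one bundle file
// and rebuild the bundle when it is missing or older than any source.
function build_bundle(array $sources, string $bundlePath): string
{
    $stale = !is_file($bundlePath);
    if (!$stale) {
        $builtAt = filemtime($bundlePath);
        foreach ($sources as $src) {
            if (filemtime($src) > $builtAt) {
                $stale = true;
                break;
            }
        }
    }

    if ($stale) {
        $code = "<?php\n";
        foreach ($sources as $src) {
            $body = file_get_contents($src);
            // Drop the opening tag and an optional trailing closing tag.
            $body = preg_replace('/^\s*<\?php/', '', $body, 1);
            $body = preg_replace('/\?>\s*$/', '', $body, 1);
            $code .= "\n// ---- " . basename($src) . " ----\n" . trim($body) . "\n";
        }
        file_put_contents($bundlePath, $code, LOCK_EX);
    }

    return $bundlePath;
}
```

Usage would be something like `require build_bundle($commonFiles, $cacheDir . '/bundle.php');`. Note that the staleness check still stats every source file, which gives back part of the saving; in production one might skip the check and delete the bundle on deployment instead.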

MainMa
Neat idea. Do you have any figures from the performance gain?
Mike B
I had thought about doing something like that, but then thought if an opcode cache were primed for the app, would it actually make a difference to combine common Helper classes into one main helper cache, and the config? One point about all the helper files separated was that it was supposed to load and process less code, but as the app has got bigger I suppose it's got a bit bloated. It would be interesting to know if an opcode cache negates high numbers of include files.
buggedcom
@Mike B: I remember measuring performance, there was a tiny performance gain, just a few milliseconds. But the framework had approximately 100 files, and only 10-25 (combined into one) were used on every website, so the performance statistics were quite useless on this point.
MainMa
+1  A: 

Are the includes themselves doing something fancy, like db queries? And are they all at the top of the page, or are they included as-needed?

Those stats don't look bad, so if admin access is infrequent, you may be OK. But you should examine this from a design angle: can things be organized in a way that would save you from having to maintain so many includes? Separate from any performance issues, there is a risk here of creating hard-to-track dependency bugs.

(It could be as MainMa said, related to a framework, in which case you may have no control over the above. I only mention it in case you do.)

A couple things in case you didn't know already:

  • If it's just text or static HTML, you can get the contents with file_get_contents(), readfile(), etc. This is somewhat faster because the loaded file doesn't need parsing. But obviously if it contains PHP code this won't help.
  • You can use include_once() to prevent the same file from being included twice (if, for instance, it's included by two files that are themselves included by the top level file).
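
A small sketch of both points, using throwaway files in the system temp directory so it runs on its own (the file names and `helper_example()` are made up for illustration):

```php
<?php
$dir    = sys_get_temp_dir();
$static = $dir . '/footer_example.html';
$helper = $dir . '/helper_example.php';

file_put_contents($static, "<p>Static footer</p>\n");
file_put_contents($helper, "<?php\nfunction helper_example() { return 'loaded'; }\n");

// Static markup: stream it straight to the output without parsing it as PHP.
readfile($static);

// PHP code: include_once loads the file the first time only.
include_once $helper;
// A repeated include_once does not re-run the file (so no "cannot redeclare"
// error) and simply returns true.
$second = include_once $helper;

echo helper_example(), "\n";
```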
John C
Yeah, the included files are doing many different things. Mainly it's the Helper files, which are autoloaded rather than loaded via include_once (that has performance penalties of its own), and the framework was designed that way on purpose so it only loads what it needs when it needs it. So there's no real dependency risk there either. But there are common helpers used on every page which I guess could be combined into a main include. That page example was one of the worst. It used many different template files for different types of input fields. In general the average page has about 60-70 files.
buggedcom
The include_once and require_once performance issues applied to *PHP4 only*.
Charles
Have you got any reading material regarding that? I've never read anything about it being php4 only. However I just did a quick search and read this - http://www.techyouruniverse.com/software/php-performance-tip-require-versus-require_once - the comments are the most interesting particularly the last one. Doesn't mention 4/5 but basically says the time difference is irrelevant with opcode caches.
buggedcom
A: 

Disk I/O won't be your problem. The system will cache frequently accessed files in RAM, or if they aren't that frequently accessed, it won't matter.

Load times may be an issue, as each file has to be requested and interpreted by the server separately.

I don't know how the web server will cope with the many requests; it may not care. If the client doesn't do pipelined requests though, you'll pay for many many TCP connections built up and torn down, which also costs a goodly amount of latency.

Slartibartfast
Are load times and TCP requests relevant to included PHP scripts and classes?
buggedcom
Hmm. I assumed that by 'include' you meant that the browser would have to load the relevant items. If I'm wrong and that part is all server side, then no, not really.
Slartibartfast
+4  A: 

So, here's a breakdown of the potential problems.

The number of files itself is an issue. Unless you're using a bytecode cache (and you are), and that cache is configured to not stat the file prior to pulling in the compiled bytecode, PHP is going to stat every single one of those files on include, then read them in. In some cases, that can also mean path resolution and a naive autoloader that pokes and prods at numerous directories. This won't be "slow" because the OS will surely have things cached if the files are hit frequently, but it does add precious milliseconds to each request.

If every autoloader is designed properly and the codebase relies entirely on the autoloader to pull in the required classes (meaning nothing uses include/require/include_once/require_once on a class file), you can avoid having to open and read many of the files by gluing every single class together into a single large include. This is a bit on the impractical side of things, mainly because if there is no bytecode cache, PHP still has to parse, compile and interpret it all. Additionally, not every class is going to be used on every request, so it may be a bit wasteful.
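
For reference, an autoloader that avoids poking at multiple directories can be as simple as a class map: one array lookup, then one `require`. A minimal sketch, assuming a hypothetical `ExampleHelper` class written to a temp directory so the example is self-contained:

```php
<?php
// Create a throwaway class file so the sketch runs on its own.
$dir = sys_get_temp_dir() . '/autoload_example';
if (!is_dir($dir)) {
    mkdir($dir, 0777, true);
}
file_put_contents(
    $dir . '/ExampleHelper.php',
    "<?php\nclass ExampleHelper { public static function greet() { return 'hi'; } }\n"
);

// Class name => absolute path. No directory scanning, no include_path lookups.
$classMap = [
    'ExampleHelper' => $dir . '/ExampleHelper.php',
];

spl_autoload_register(function ($class) use ($classMap) {
    if (isset($classMap[$class])) {
        require $classMap[$class];
    }
});

// First use of the class triggers the autoloader.
echo ExampleHelper::greet(), "\n";
```

Because each class file is pulled in only through the autoloader, the same class map could later point several classes at one combined file without touching the calling code.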

The bottom line is that a well-configured bytecode cache will completely mitigate this problem. There's nothing wrong with telling your customers that they have to properly configure their servers for optimal performance. If they know what they're doing, they'll have everything correct to begin with.
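
As a rough way to check this on a given server, the sketch below reports whether a bytecode cache is loaded and whether it re-checks file timestamps. It assumes the Zend OPcache extension, where the relevant settings are `opcache.validate_timestamps` and `opcache.revalidate_freq`; with APC the corresponding setting is `apc.stat`. The function name `bytecode_cache_status` is made up for illustration.

```php
<?php
// Hypothetical helper: report the bytecode cache settings that decide
// whether PHP stats each file before using its cached bytecode.
function bytecode_cache_status(): array
{
    $loaded = extension_loaded('Zend OPcache');

    return [
        'opcache_loaded'      => $loaded,
        // "0" means cached files are never re-checked until the cache is reset.
        'validate_timestamps' => $loaded ? ini_get('opcache.validate_timestamps') : null,
        // Seconds between timestamp checks when validation is enabled.
        'revalidate_freq'     => $loaded ? ini_get('opcache.revalidate_freq') : null,
    ];
}

print_r(bytecode_cache_status());
```

Turning timestamp validation off removes the per-file stat, at the cost of having to reset the cache after each deployment.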

Charles
OK, I get what you're saying, and I agree that there is nothing wrong with telling them they have to install a cache for performance. I don't think they will complain about that much, but I have had a few server techs in larger companies basically say no because they have no one in the company who knows about that side of PHP, so they can't support it. It is ridiculously stupid when you think about it, especially given the performance gains. I also think that I will probably create a config option that automatically compiles common Helper classes into one big include for those stubborn customers.
buggedcom
You should consider a professional installation service with the software, for an extra fee. ;)
Charles
A: 

Honestly, don't worry about it. 148 is nothing: even if no caching happened on the PHP side, you're going to be hitting filesystem caches almost every time. In the grand scheme of things, virtually every open-source project out there has far more files without a problem (Drupal, WordPress, Joomla, Elgg, anything).

Really, no problem here. Even if you managed to shave off a millisecond here or there, it's so far down the list of places where you can make speed gains that it's barely worth considering for more than a second.

Caveat: do try to use require_once and include_once where suited, and ensure you only load the classes/files that are needed to process a given request.

nathan