There are many scenarios where I've questioned PHP's performance with some of its functions, and whether I should build a complex class to handle specific things using its seemingly slow tools.

For example, running complex regular expressions through sed and processing the results with awk seems like it would be dramatically faster than making PHP's regular expression functions, and its seemingly excessive set of helpers, grind through the same input until they eventually finish. If I had to do a lot of network tasks simultaneously, such as MX lookups, dig queries, or retrieving data, I would rather pass them via system() and let the OS handle them itself. There are simply too many functions in PHP that are inefficient and result in slow pages, or that the OS can handle more easily.

What are your opinions?

Do you think I should do the hard work with the OS in its own/custom functions?

+3  A: 

System calls can very often be faster than using a solution built in PHP (although that doesn't always hold true, seeing as PHP's functions are themselves built and compiled in C. Many PHP core functions and extensions are pretty fast).

Apart from speed, a second factor is the memory limit. Externally called processes don't eat away at PHP's per-script limitation, which can be great when working with large files for example.

Also, some functions simply are not available in PHP itself. There is no way to imitate the feature set of ImageMagick entirely within PHP, for example. The GD library doesn't come close to what ImageMagick has to offer.

The big, big minus is that by using system commands, you effectively eliminate portability, which is part of PHP's beauty. Moving the application to a different server becomes a huge burden because the feature set of the external commands needs to be identical, and that isn't always the case even across different Linux distros, not to speak of crossing the OS border into Windows or the Unix-based Mac OS. I have experienced issues with wget and ImageMagick in this respect myself, and I'm sure there are many more.

If you are working on a custom application for which you entirely control the server environment (and the decision what kind of servers will be bought in the next five years), that may not be a problem. It will be one, though, if you build software that needs to be portable.

I personally tend to rather cut away a feature (that would need an external dependency) than lose portability, but then, I am in the trade of building portable applications very much. It really depends on your focus.

Pekka
On the topic of ImageMagick: check out the PHP Imagick extension developer's blog for some cool examples: http://valokuva.org/?cat=1
janmoesen
A: 

I think that would be slower actually, because each time you call such a function the OS would start a new process, and that is time consuming.
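A rough way to check this on your own machine is to time both approaches side by side. The following is only a minimal sketch: the input string, the pattern, and the iteration count are made up for illustration, and it assumes a Unix-like host where sed is on the PATH and shell_exec() is not disabled.

```php
<?php
// Hypothetical micro-benchmark. The input, pattern, and iteration count
// are illustrative assumptions; real numbers depend on the machine.
$input = 'hello world';
$iterations = 200;

// In-process: the replacement runs inside the PHP interpreter (PCRE).
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $inProcess = preg_replace('/world/', 'PHP', $input);
}
$inProcessSeconds = microtime(true) - $start;

// External: every iteration starts a shell and a sed process.
// Assumes sed is installed and shell_exec() is enabled.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $external = shell_exec(
        'printf %s ' . escapeshellarg($input) . " | sed 's/world/PHP/'"
    );
}
$externalSeconds = microtime(true) - $start;

printf("preg_replace:       %.4f s\n", $inProcessSeconds);
printf("sed via shell_exec: %.4f s\n", $externalSeconds);
```

For many small inputs like this, the per-call process startup is what gets measured. For one very large input the picture may look different, so it is worth testing with data shaped like your own.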

clyfe
A: 

I'd say that if your program is intended to be run from the shell, using external programs like sed/awk is fine. Shell scripts also make extensive use of external programs, and a PHP script run from the shell is just like a shell script, only in another language. However, if it's a web application, better do it in PHP: most shared hosting environments don't allow you to execute external programs from PHP scripts.

ThiefMaster
+1  A: 

Even if system processes are faster and hog less memory (extensive testing is a must here), there's something to keep in mind:

I'd be cautious about using system() calls and would only use them if you control the hardware your script will run on. Those calls may require installing further software or packages and may not work (or not work the same way) on every OS, so if you can't control the server, I'd stick with PHP functions.
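If you do want the external command where it exists, one option is to detect it at runtime and fall back to plain PHP otherwise. This is only a sketch under assumptions: commandExists() and countLines() are hypothetical helper names, the line-counting task is just an illustration, and the detection relies on a POSIX shell (`command -v` is not available in Windows cmd.exe).

```php
<?php
// Hypothetical helper: true if $command can be found on the PATH.
// Assumes a POSIX shell; treats a disabled exec() as "not available".
function commandExists(string $command): bool
{
    if (!function_exists('exec')) {
        return false;
    }
    exec('command -v ' . escapeshellarg($command) . ' 2>/dev/null', $output, $exitCode);
    return $exitCode === 0;
}

// Hypothetical example task: count newline characters in a file,
// using wc when it is available and pure PHP otherwise.
function countLines(string $path): int
{
    if (commandExists('wc')) {
        // exec() returns the last line of the command's output.
        return (int) trim((string) exec('wc -l < ' . escapeshellarg($path)));
    }

    // Fallback: read in chunks so large files do not hit memory_limit.
    $count = 0;
    $handle = fopen($path, 'rb');
    while (!feof($handle)) {
        $count += substr_count((string) fread($handle, 8192), "\n");
    }
    fclose($handle);
    return $count;
}

// Usage with a throwaway file.
$tmp = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($tmp, "a\nb\nc\n");
echo countLines($tmp), "\n";
unlink($tmp);
```

The fallback counts newline characters on purpose so that both paths agree with what `wc -l` reports. Keeping the two paths equivalent is exactly the kind of extra testing that external dependencies bring with them.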

Select0r
A: 

(In my experience, "system calls" usually refers to invoking kernel operations, not running other programs. Your phrase "pass it via system() and let the OS handle it" suggests you see it the same way, but none of the programs you mention are OS services; they are just other programs.)

PHP is essentially a scripting language, and scripting languages are conventionally just glue for moving data between other programs. Still, there are some things to consider:

1) performance - forking a new process can be computationally expensive

2) security - giving your webserver unlimited access to all the programs on the system (even constrained by permissions) is potentially very dangerous

3) bearing in mind (2) most configurations will prevent or limit what you can do

4) for large scale development this is rather dangerous - letting programmers write their code in any language of their choice then putting a skim of PHP over the top means you will end up with an application written in lots of different languages

5) You can write your own native code PHP extensions quite easily
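To illustrate point 2: the most common mistake is concatenating user input straight into a command line. Here is a minimal sketch of the difference escapeshellarg() makes. The $domain value is a made-up hostile input, the dig invocation is only an example command, and the quoting described in the comments is what PHP produces on Unix-like systems (it differs on Windows). Nothing is executed; the script just prints the two command strings.

```php
<?php
// Hypothetical user input containing a shell metacharacter.
$domain = 'example.com; rm -rf /';

// Unsafe: a shell would treat everything after ";" as a second command.
$unsafe = 'dig +short -t MX ' . $domain;

// Safer: escapeshellarg() wraps the value in single quotes (on Unix-like
// systems), so the shell passes it to dig as one literal argument.
$safe = 'dig +short -t MX ' . escapeshellarg($domain);

echo $unsafe, "\n";
echo $safe, "\n";
```

Escaping only addresses injection through arguments. It does not change what the web server's user is allowed to run, which is the broader concern in points 2 and 3.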

"If I were to do a lot of network tasks such as MX lookups/DIGging/retrieving"

While I could believe that mashing up large data sets using awk/sed might be faster or more efficient than native PHP code, I find it a bit surprising that DNS lookups would be faster using a different client. How did you measure this?

symcbean