I've seen people in a few places lately saying that PHP has a poor capacity for recursion. I recently wrote a recursive PHP function for graph traversal and found it to be very slow compared to Java. I don't know whether this is because of PHP's capacity for recursion or because PHP is slower than Java in general.

Some googling revealed this (http://bugs.php.net/bug.php?id=1901)

[7 Aug 1999 12:25pm UTC] zeev at cvs dot php dot net

PHP 4.0 (Zend) uses the stack for intensive data, rather than using the heap. That means that its tolerance [for] recursive functions is significantly lower than that of other languages.

It's relatively easy to tell Zend not to use the stack for this data, and use the heap instead - which would greatly increase the number of recursive functions possible - in the price of reduced speed. If you're interested in such a setting, let me know, we may add a compile-time switch.

What does it mean to say that PHP uses the stack for intensive data? Does PHP not set up a run-time stack? Also, is it true in general that recursion in PHP is much slower than in other languages? And by how much?

Thanks!

+2  A: 

Okay, I'll take a stab at it.

First: "the stack" is the area of memory used for function call tracking in standard C/C++-type programs. Its location is defined by the operating system and the language's calling conventions, and it's treated like a stack (the data structure). When you call a C function fibonacci(int i), the argument i and the return address (the place in the caller where execution resumes) are pushed onto the stack. That takes some memory. When the call returns, that memory is available again. The stack is of finite size, so if you store very large variables on it and make many, many recursive calls, you may run out of room. Right?
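To make "each call takes some room until it returns" concrete, here is a minimal PHP-level sketch. The function name fib and its depth-tracking parameters are my own, invented for illustration; it only counts how many calls are nested at once and says nothing about how Zend lays out memory internally.

```php
<?php
// Hypothetical helper: naive recursive Fibonacci that also records the
// deepest nesting reached, i.e. how many calls were live at the same time.
function fib(int $n, int $depth, int &$maxDepth): int
{
    $maxDepth = max($maxDepth, $depth);
    if ($n < 2) {
        return $n; // base case: no further calls, this frame can be released
    }
    // Each recursive call sits one level deeper than its caller.
    return fib($n - 1, $depth + 1, $maxDepth)
         + fib($n - 2, $depth + 1, $maxDepth);
}

$maxDepth = 0;
$result = fib(10, 1, $maxDepth);
echo "fib(10) = $result, deepest nesting = $maxDepth\n";
// prints "fib(10) = 55, deepest nesting = 10"
```

The total number of calls is much larger than 10, but only the ones currently nested occupy space at the same moment, which is why depth, not call count, is what runs you out of room.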

So.....

Apparently Zend has two ways to allocate data: on the heap (a more general area to request memory from) and on the stack, and the stack is the more efficient of the two because of the way things are programmed. (I don't know why, but I can guess. There may be very low-level caching concerns: I expect the stack is likelier to be in L1 or L2 cache than arbitrary heap memory would be, because the CPU touches that area very frequently, every time you call a function, in fact. There may also be allocation overhead for heap data.)

"Intensive" data in this context, I believe, refers to data which is very likely to be used very soon or very often. It would make sense to use the speedier stack-based allocation for these variables. What sort of variables are you certain to be using very quickly? Well, how about parameters to a function? You're very likely to use those: otherwise why would you be bothering to pass them around? They're also probably likelier to be small data items (references to massive data structures rather than massive data structures themselves - because that gives you copying overhead, among other things). So the stack probably makes sense for storing PHP function parameters for most PHP programmers... but it fails sooner in recursion.

Hopefully that answers at least "what does this mean?". For your recursion performance question: Go benchmark it yourself; it probably depends on what sort of recursion you're trying to do.
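If you want a starting point for that benchmark, something like the following rough sketch would do. The function names (sumRecursive, sumIterative, timeIt), the depth of 80, and the round count are all arbitrary choices of mine, and timings will vary by machine and PHP version, so treat the numbers it prints as a hint at best.

```php
<?php
// Hypothetical workload: sum 1..n, once with recursion and once with a loop.
function sumRecursive(int $n): int
{
    return $n === 0 ? 0 : $n + sumRecursive($n - 1);
}

function sumIterative(int $n): int
{
    $total = 0;
    for ($i = 1; $i <= $n; $i++) {
        $total += $i;
    }
    return $total;
}

// Runs $fn($arg) $rounds times and returns the elapsed wall-clock seconds.
function timeIt(callable $fn, int $arg, int $rounds): float
{
    $start = microtime(true);
    for ($r = 0; $r < $rounds; $r++) {
        $fn($arg);
    }
    return microtime(true) - $start;
}

$depth  = 80;     // kept shallow so debugger nesting limits do not interfere
$rounds = 10000;

printf("recursive: %.4f s\n", timeIt('sumRecursive', $depth, $rounds));
printf("iterative: %.4f s\n", timeIt('sumIterative', $depth, $rounds));
```

Running the equivalent pair in Java would tell you whether the gap you saw comes from recursion specifically or from the two runtimes in general.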

fennec
@Eric: on the subject of measuring performance, XDebug offers PHP profiling (http://xdebug.org/docs/profiler).
outis
A: 

At a guess, I'd say that your problem lies elsewhere than the recursion itself. For many things, Java is a lot faster than PHP. There are, sort of, ways to improve PHP's performance.

However, the PHP recursion limitation results in PHP running out of stack and crashing, with the dreaded 'stack overflow' message (pun sort of intended). At this point, your program ceases to execute.

If PHP is using a dynamic stack, you could see some (mild) slowdown due to the time it takes to realloc the stack to a larger block of memory.
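If the depth limit turns out to be the real issue, one common workaround is to drop recursion and keep your own stack in an ordinary array. Below is a minimal sketch of a depth-first traversal done that way. I don't know what your graph looks like, so the adjacency-list format, the sample data, and the name dfsIterative are assumptions.

```php
<?php
// Depth-first traversal with an explicit stack instead of recursive calls.
// $graph is assumed to be an adjacency list: node => list of neighbours.
function dfsIterative(array $graph, $start): array
{
    $visited = [];
    $order   = [];
    $stack   = [$start];

    while ($stack) { // an empty array is falsy, so this ends when nothing is pending
        $node = array_pop($stack);
        if (isset($visited[$node])) {
            continue;
        }
        $visited[$node] = true;
        $order[] = $node;

        // Push in reverse so neighbours are visited in their listed order.
        foreach (array_reverse($graph[$node] ?? []) as $next) {
            if (!isset($visited[$next])) {
                $stack[] = $next;
            }
        }
    }
    return $order;
}

// Hypothetical sample graph.
$graph = ['a' => ['b', 'c'], 'b' => ['d'], 'c' => ['d'], 'd' => []];
$order = dfsIterative($graph, 'a');
echo implode(' ', $order), "\n"; // prints "a b d c"
```

The pending nodes now live in a regular PHP array, so how deep the traversal goes no longer depends on how many nested function calls PHP tolerates.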

Anyway, I'd need to know a bit more about what you're doing to pinpoint your performance problem, which is something I do for a living...

Perry Munger