views:

1300

answers:

20

What surprises have other people found when writing PHP web applications? There's the well-known and to-be-fixed issue with compile-time class inheritance, but I know of a couple of others and wanted to try to build a list of the top gotchas of the language.

Note:

I've held several positions as a Sr. PHP5 developer, so PHP work pays my bills. This question is not meant to bust on PHP as a language; every single language I've worked with has some well known or not so well known surprises.

+6  A: 

It was kind of obvious after the fact but a well known gotcha has to do with scope and references when used in foreach.

foreach($myArray as &$element){
   //do something to the element here... maybe trim or something more complicated
}
//Multiple lines or immediately after the loop

$element = $foobar;

The last cell in your array has now become $foobar because the reference in the foreach above is still in the current context scope.
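A minimal sketch of the effect, assuming a small array of padded strings (the array contents and the use of trim() are illustrative, not from the original code). The second half shows the unset() habit, which breaks the lingering reference:

```php
<?php
$myArray = array(' a ', ' b ', ' c ');
foreach ($myArray as &$element) {
    $element = trim($element);
}
// $element is still a reference to the last cell of $myArray here.
$element = 'foobar';
// $myArray is now array('a', 'b', 'foobar')

$safeArray = array(' a ', ' b ', ' c ');
foreach ($safeArray as &$item) {
    $item = trim($item);
}
unset($item);      // break the reference right after the loop
$item = 'foobar';  // a plain new variable; $safeArray is untouched
```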

David
PHP has no block scope, just function/class/global scope.
Gumbo
Granted, but this is no performance issue, is it? Maybe you should edit your question to include other pitfalls?
cg
@cg Good point... :)
David
The by-reference foreach() is more trouble than it's worth, just get the array key at the same time and update the array manually.
too much php
A common practice is to always `unset($element);` immediately after iterating an array by reference. Then you're safe later down the page. I've seen it done on the same line as the closing bracket even.
philfreo
+12  A: 

require_once and include_once can often be major performance killers when used excessively. If you're including or requiring a file that holds a class, a pattern like this can save some serious processing time.

class_exists("myFoo") or require("myFoo.someClass.php");

Update: This is still an issue - http://www.techyouruniverse.com/software/php-performance-tip-require-versus-require_once

Update: Read the selected answer for the following question: http://stackoverflow.com/questions/135373/would-performance-suffer-using-autoload-in-php-and-searching-for-the-class-file If implemented along these lines, you pretty much minimize as best as possible the penalties for file include/requires.
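For what it's worth, here is a rough sketch of the autoload approach using spl_autoload_register(). The temp directory, the one-class-per-file naming and the MyFoo class are all assumptions made up so the example runs on its own; they are not taken from the linked answer.

```php
<?php
// Hypothetical setup: write a throwaway class file so the sketch is self-contained.
$dir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/MyFoo.php', "<?php\nclass MyFoo {}\n");

$loadCount = 0;
spl_autoload_register(function ($class) use ($dir, &$loadCount) {
    // Assumed convention: one class per file, file named after the class.
    $file = $dir . '/' . $class . '.php';
    if (is_file($file)) {
        $loadCount++;
        require $file;
    }
});

$first  = new MyFoo(); // class unknown, so the autoloader runs and requires the file
$second = new MyFoo(); // class already defined, so no second file lookup
```

The file is only touched the first time the class is needed, which is the same effect the class_exists() pattern above is after.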

David
Really? I wasn't aware of this. Has anyone measured the difference between require_once() and your pattern?
cg
I find that hard to believe also - class_exists will involve some kind of hash lookup, and so does require_once
Paul Dixon
Autoloading (http://uk.php.net/autoload) is a cleaner and more flexible workaround for this.
Rob
David
Interesting! Though I do use autoloading myself :)
Paul Dixon
+17  A: 

I'm not sure if this counts, but the need to compile PHP scripts is a huge performance issue. In any serious PHP project you need some kind of compiler cache like APC, eAccelerator, PHP Accelerator, or the (commercial) Zend Platform.

cg
Definitely agree with that, that's why I believe PHP now includes APC as part of its standard set of extensions.
David
Oh yes, you're right. I forgot about APC!
cg
@David: I don't think it is included quite yet, but it is scheduled for inclusion in PHP6.
R. Bemrose
@R. Bemrose that's disappointing if it's true
David
+2  A: 

Total memory while running PHP. Many large projects just include all the class files and use them when they need them. This adds to the total memory PHP needs to use for each run.

Also watch out for projects using frames or iframes, as each frame is a separate request and this could easily double your memory usage.

So load your class files conditionally, and have nothing loaded that you aren't using.

Ólafur Waage
Ignoring any other faults it might have, this is why I loved Code Igniter for so long because it tries to only include logic as needed.
David
I do not understand how frames tie into PHP - please explain.
Sander
The problem with iframes only exists if your main page and your iframe page are initialized in the same way, and thus you include all the classes and database connections etc twice for every request (they are still separate requests, but you get the picture :)
Jan Hancic
+5  A: 

__autoload() proved to be a major landmine for me recently. Some of our legacy code and libraries use class_exists(), which by default tries to autoload classes that were never meant to be loaded that way. The result was lots of fatal errors and warnings. class_exists() can still be used if you have autoload, but the second parameter (available since PHP 5.0.0) has to be set to false.
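A small sketch of the difference, using a counting autoloader registered with spl_autoload_register() (the class name NoSuchLegacyClass is made up for the example):

```php
<?php
$autoloadCalls = 0;
spl_autoload_register(function ($class) use (&$autoloadCalls) {
    // A real autoloader would try to require a file here.
    $autoloadCalls++;
});

// Default behaviour: an unknown class triggers the autoloader.
$foundWithAutoload = class_exists('NoSuchLegacyClass');
$callsAfterFirstCheck = $autoloadCalls;

// Second argument false: only already-declared classes are checked.
$foundWithoutAutoload = class_exists('NoSuchLegacyClass', false);
$callsAfterSecondCheck = $autoloadCalls;
```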

Jeremy DeGroot
__autoload is a cool idea but it also spooks me because once you start using it in a project, especially a large multi-team one, it's hard to stop using it.
David
+2  A: 

Performance issues with PHP apps are usually one of the following:

  • File system access - reading and writing to disk
    • This is where APC, eAccelerator, etc come in handy, they reduce file system access by caching parsed PHP files in memory
  • Database - slow queries, large datasets
  • Network I/O - accessing external resources

It's quite rare for code execution itself to be the performance problem in PHP (or in a web app written in any language). The operations above are usually orders of magnitude slower than code execution.

As always, profile your code!

Ryan Doherty
+4  A: 

Not being aware of the operator precedence can cause some problems:

if ($foo = getSomeValue() && $bar) {
    // …
}
// equals
if ($foo = (getSomeValue() && $bar)) {
    // …
}
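A quick sketch of what ends up in $foo each way. getSomeValue() here is a stand-in that just returns a string, and $bar is assumed to be true:

```php
<?php
function getSomeValue() {
    return 'value'; // hypothetical return value for the demo
}
$bar = true;

// && binds tighter than =, so $foo receives the boolean result.
if ($foo = getSomeValue() && $bar) {
    $unparenthesized = $foo;
}

// Parenthesizing the assignment keeps the original value.
if (($foo = getSomeValue()) && $bar) {
    $parenthesized = $foo;
}
```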
Gumbo
I just recently did something to that effect... "How the F^%$ is this turning into a bool?"
David
That's why one should always use parentheses when in doubt
Imran
Assigning values inside a conditional test isn't exactly good form to start with.
Henrik Paul
@Henrik Paul: But that’s common practice, even in other languages: `while (line = readline(file)) { … }`
Gumbo
+4  A: 

The big gotcha I've seen people fall prey to is precision (in php and other languages).

If you want a bit of fun compare any float to a whole with >= and find out how many times you get the expected result.

This has been the downfall of many people working with money inside of PHP and trying to make logic decisions based on comparisons that do not allow rounding to a whole number.

For example - fabric

Fabric is sold in units of 1 yard or 1 half yard as well as maintaining an inventory of exact measurement left of the fabric.

If this system isn't expressed in whole numbers and instead is expressed in floating points it will make it incredibly hard to make solid decisions.

Your best bet is to express 1 half yard as 1, for example if you have 300 yds of fabric, you would have an inventory of 600 (600 half yard units).

Anyways, that's my gotcha - time to refactor 4 months of programming due to not understanding precision....
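A minimal sketch of the kind of comparison that goes wrong, and the whole-number alternative. It uses tenths of a yard rather than half yards, purely as an assumption to make the rounding error visible:

```php
<?php
// Ten pieces of 0.1 yard, tracked as a float.
$floatTotal = 0.0;
for ($n = 0; $n < 10; $n++) {
    $floatTotal += 0.1;
}
$floatReachedOneYard = ($floatTotal >= 1.0); // false: the sum ends up just below 1.0

// The same ten pieces tracked as whole units (1 unit = one tenth of a yard).
$unitTotal = 0;
for ($n = 0; $n < 10; $n++) {
    $unitTotal += 1;
}
$unitsReachedOneYard = ($unitTotal >= 10); // true: integer arithmetic is exact
```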

Syntax
Floats are particularly nasty in PHP, because of the weak typing. You can get really odd results sometimes.
troelskn
Yep - they almost seem random at times as to which way the operators decide to go.
Syntax
It took me by surprise recently when I did a more math related task... I had assumed that since PHP had modulo (%) that it would not convert an integer to a float. Fortunately caught that one when debugging my code with xdebug.
David
+5  A: 
  • foreach() silently copies the array in the background and iterates through that copy. If you have a large array this will degrade performance. In those cases, use the by-reference option of foreach() that is new to PHP 5, or use a for() loop.

  • Be aware of equality (==) vs. identity (===).

  • Be aware of what constitutes empty() vs. what constitutes isset().


More landmines now that I have some more time:

  • Don't compare floats for equality. PHP isn't matlab and it simply isn't designed for precise floating point arithmetic. Try this one:
if (0.1 + 0.2 == 0.3)
  echo "equal";
else
  echo "nope"; // <-- ding ding
  • Similarly, don't forget your octals! An integer literal with a leading zero is interpreted as octal.
if (0111 == 111)
  echo "equal";
else
  echo "nope"; // <-- ding ding
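One common way around the float comparison is to check against a small tolerance instead of using ==. A sketch, where the $epsilon value is an arbitrary assumption you would pick to suit your data:

```php
<?php
$sum = 0.1 + 0.2;

$exactlyEqual = ($sum == 0.3);                // false
$epsilon = 0.00001;                           // assumed tolerance, adjust as needed
$closeEnough = (abs($sum - 0.3) < $epsilon);  // true

// The octal literal from the second example, as a decimal value.
$octalLiteral = 0111;                         // 73 in decimal
$matchesDecimal = ($octalLiteral == 111);     // false
```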
Encoderer
iirc, php arrays are copy-on-write, so iterating over it using foreach() won't incur any extra memory unless you modify them. References should be avoided, they're a major landmine.
Richard Levasseur
Not to be a prick, but foreach() makes a copy and iterates that. Whether or not you modify that. Iterating byref inside foreach() prevents that, actually.
Encoderer
Note: Unless the array is referenced, foreach operates on a copy of the specified array and not the array itself. foreach has some side effects on the array pointer. Don't rely on the array pointer during or after the foreach without resetting it. - http://us3.php.net/foreach. ydnrc.
Encoderer
Disagreeing doesn't make you a prick :). PHP modifying the internal array cursor doesn't cause the copy to occur, though. An external action will, like reset($a) or $a[$k]=$v. A quick script verifies.
Richard Levasseur
Rich, this is verifiable. Read that excerpt from PHP Docs. I copy and pasted it for you "Unless the array is referenced, foreach operates on a copy of the specified array and not the array itself." And if you really want, run it under a debugger. You can see the copy happen in zend_vm_execute.h
Encoderer
@Encoderer this is somewhat terrifying to me as I think of how much I rely on foreach in my code.
David
+2  A: 

Another pitfall in PHP. I've seen this error from people who come from other languages, though not often.

<?php
/**
 * regular
 */
echo (true && true); // 1
echo (true && false); // nothing

echo (true || false); // 1
echo (false || false); // nothing

echo (true xor false); // 1
echo (false xor false); // nothing

/**
 * bitwise
 */
echo (true & true); // 1
echo (true & false); // 0

echo (true | false); // 1
echo (false | false); // 0

echo (true ^ false); // 1
echo (false ^ false); // 0
?>
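To make the difference concrete, here is a small sketch of the result types. The logical operators give back booleans while the bitwise ones give back integers, which matters as soon as a strict (===) check is involved:

```php
<?php
$logical = (true && false); // bool(false)
$bitwise = (true & false);  // int(0)

$logicalIsFalse = ($logical === false); // true
$bitwiseIsFalse = ($bitwise === false); // false: it is the integer 0, not a boolean
$bitwiseIsZero  = ($bitwise === 0);     // true
$looselyEqual   = ($logical == $bitwise); // true: loose comparison hides the difference
```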
Ólafur Waage
Erm... what's the error?
chaos
No error, just a pitfall of the language that some people fall into.
Ólafur Waage
What I mean is, how would people be using this that would get them into trouble?
chaos
When they do a === check for 0 or empty and work on code that uses both bitwise and regular checks for some reason.
Ólafur Waage
+5  A: 

The @ error silencer should always be avoided.

An example:

// Don't let the user see an error if this unimportant header file is missing:
@include 'header.inc.php';

With the code above, you will never know about any errors in any of the code in header.inc.php, or any of the functions called from header.inc.php, and if there is a Fatal Error somewhere, your web page will halt with no way to find out what the error was.
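One possible alternative, sketched under the assumption that the header really is optional: check for the file explicitly and log when it is missing, so errors inside the file itself are no longer silenced. The helper name includeIfPresent() is made up for this example.

```php
<?php
function includeIfPresent($path) {
    if (!is_readable($path)) {
        // Record the problem instead of hiding it from everyone.
        error_log("Optional include not found: $path");
        return false;
    }
    include $path; // note: the included code runs in this function's scope
    return true;
}

$headerLoaded = includeIfPresent(__DIR__ . '/header.inc.php');
```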

too much php
I agree with you and know why, but could you explain in your answer what some of the pitfalls of (ab)using the @ symbol are?
David
Jan Hancic
Reading from @$_POST is the *only* safe example that I know of.
too much php
This is a good rule for newbies. But for an experienced developer who has more developed debugging abilities, there are many valid reasons for suppressing certain errors. This is especially true when you implement a custom error handler. And any app really should have a custom handler.
Encoderer
Use my magic "value()" function: http://stackoverflow.com/questions/55060/php-function-argument-error-suppression-empty-isset-emulation/1867434#1867434 and the "@" in php should be a thing of the past.
Bob Fanger
+12  A: 

Recursive references leak memory

If you create two objects and store them inside properties of each other, the garbage collector will never touch them:

$a = new stdClass;
$b = new stdClass;
$a->b = $b;
$b->a = $a;

This is actually quite easy to do when a large class creates a small helper object which usually stores the main class:

// GC will never clean up any instance of Big.
class Big {
  function __construct() {
    $this->helper = new LittleHelper($this);
  }
}
class LittleHelper {
  function __construct(Big $big) {
    $this->big = $big;
  }
}

As long as PHP is targeted at short fast page requests, they are not likely to fix this issue. This means that PHP can't be depended on for daemons or other applications that have a long lifespan.
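A hedged sketch for anyone on PHP 5.3 or later, where a cycle collector and the gc_collect_cycles() function are available. On older versions this function does not exist and the description above applies as written.

```php
<?php
gc_enable(); // the collector is normally on by default; made explicit here

$a = new stdClass;
$b = new stdClass;
$a->b = $b;
$b->a = $a;

// Drop our own handles; the two objects now only reference each other.
unset($a, $b);

// Ask the cycle collector to run now and report how much it reclaimed.
$collected = gc_collect_cycles();
```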

too much php
Circular garbage collector should be included in PHP 5.3: http://www.ibm.com/developerworks/opensource/library/os-php-5.3new1/index.html#N101D4
OIS
You should change this to say "Circular references", the common term.
erikkallen
+1  A: 

Just thought of one more surprise. array_map which applies a callback to an array, is a serious performance killer. I'm not totally sure why, but I think it has something to do with PHP's copy on write mechanism for loops.

David
OIS: interesting, which version of PHP?
David
array_map is for 2+ arrays it seems. It's not faster than coding your own loop for one array. Good catch.
OIS
+10  A: 

NULL and the "0" string are pure evil in Php

if ("0" == false)  // true
if ("0" == NULL)   // false: NULL is compared as "", and "0" != ""
if ("0" == "NULL") // false: both are strings, and "NULL" is not numeric
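A short sketch of how the "0" string behaves under loose checks versus strict ones. The variable $input is just a stand-in for something like a form field:

```php
<?php
$input = "0"; // e.g. a form field where the user typed 0

$looselyFalse  = ($input == false);   // true: "0" is treated as falsy
$strictlyFalse = ($input === false);  // false: it is a string, not a boolean
$strictlyNull  = ($input === null);   // false
$looksEmpty    = empty($input);       // true: empty() also treats "0" as empty
$hasValue      = (strlen($input) > 0); // true: the string does contain a character
```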
Robert Gould
That 2nd example seems fishy to me: Null compared to ANYTHING is Null. It's the absence of value, you can't do an equality comparison against it.
Encoderer
The problem is NULL will be converted to a "0" string... why, no idea, but its in the specs...
Robert Gould
That's why you should use strict comparison (===)
Jan Hancic
definitely, but when you're just starting out, and especially when coming from languages without the ===, say C or C++, this is really frustrating
Robert Gould
+1 definitely agree this can be a surprise for coders moving into scripting land :)
David
To check for "has a value" I recommend using `strlen()`. This does produce the correct results with form input.
vdboor
+3  A: 

My favorite PHP gotcha:

Consider this include:

# ... lots of code ...
$i = 42;
# ... more code ...

Then use this include somewhere:

for($i = 0; $i < 10; $i++){
    # ...
    include 'that_other_file.php';
}

Then try to guess how many times the loop runs. Yup, once. Lexical scoping (and proper dynamic scoping) are both solved problems. But not in PHP.
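One way to sidestep this, sketched here as an assumption rather than a recommendation from the original post: do the include inside a function, so the included file's variables land in that function's scope instead of the loop's. The temp file stands in for that_other_file.php and includeIsolated() is a made-up helper name.

```php
<?php
// Stand-in for that_other_file.php: it assigns to $i like the original.
$includeFile = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($includeFile, '<?php $i = 42;');

function includeIsolated($path) {
    include $path; // $i = 42 is set in this function's local scope only
}

$runs = 0;
for ($i = 0; $i < 10; $i++) {
    includeIsolated($includeFile);
    $runs++;
}

unlink($includeFile);
```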

jrockway
I haven't tested this, but I would have thought it would run 10 times, but $i would be 10 after the for exits.
R. Bemrose
The post could do with some editing to get the point across it seems. that_other_file.php sets $i to a value which will affect the encasing for loop. If it's set to a value lower than 10 you get an infinite loop. Never include a file (or eval code, which is dangerous and slow anywhere) in a loop directly.
OIS
+11  A: 

A fun landmine: global variables can affect $_SESSION when register_globals is on. But I guess that's what happens when register_globals, a landmine itself, is turned on.

Richard Levasseur
Wouldn't the real landmine be register_globals itself? Also, you should check again on that foreach silent copy comment.
Encoderer
+1 for register_globals as one of PHP's major landmines.
cg
Recently had to maintain a project that used `register_globals` throughout. I was ready to hang myself by the end of it.
tj111
+3  A: 

If you're used to languages with intelligent logical operators, you will try to do things like:

$iShouldTalkTo = $thisObj || $thatObj;

In PHP, $iShouldTalkTo is now a boolean value. You're forced to write:

$iShouldTalkTo = $thisObj ? $thisObj : $thatObj;

Out of all the examples of how early design decisions in PHP tried to hold the hands of incompetent programmers in exchange for hobbling competent ones, that may be the one that irritates me the most.
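For comparison, a small sketch of the three forms. The short ternary (?:) is only available from PHP 5.3 on, so treat that line as an assumption about your PHP version; $thisObj being null is just a convenient test case.

```php
<?php
$thisObj = null;         // pretend the preferred object is missing
$thatObj = new stdClass; // fallback object

$viaOr      = $thisObj || $thatObj;            // bool(true), not an object
$viaTernary = $thisObj ? $thisObj : $thatObj;  // the fallback object
$viaShort   = $thisObj ?: $thatObj;            // same result, PHP 5.3+ only
```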

Deep brain-damage in the switch() construct abounds. Consider this:

switch($someVal) {
case true  :
    doSomething();
    break;
case 20    :
    doSomethingElse();
    break;
}

Turns out that doSomethingElse() will never be called, because 'case true' will absorb all true cases of $someVal.

Think that's justifiable, perhaps? Well, try this one:

for($ix = 0; $ix < 10; $ix++) {
    switch($ix) {
    case 3  :
        continue;
    default :
        echo ':';
    }
    echo $ix;
}

Guess what its output is? Should be :0:1:2:4:5:6:7:8:9, right? Nope, it's :0:1:23:4:5:6:7:8:9. That is, it ignores the semantics of the continue statement and treats it as a break.
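A sketch of the same loop with continue 2, which targets the enclosing for loop instead of the switch. The output is collected in a string ($out is just a helper variable for the demo) rather than echoed:

```php
<?php
$out = '';
for ($ix = 0; $ix < 10; $ix++) {
    switch ($ix) {
    case 3:
        continue 2; // skip to the next iteration of the for loop
    default:
        $out .= ':';
    }
    $out .= $ix;
}
// $out is ':0:1:2:4:5:6:7:8:9'
```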

chaos
I vaguely remember some sort of talk about adding `break #;` syntax. Maybe they've "fix" this with a `continue #` as well?
David
continue 2; // fix
OIS
$iShouldTalkTo = $thisObj or $thatObj; But ternary is more clear.
OIS
@OIS: Yeah, for a minute I thought that would work. It doesn't. Even worse, it acts close enough to right that you think it is working, but the precedence of 'or' is *too low* for it to: it binds more loosely than '=', so the assignment happens first. Do some testing on it.
chaos
+1  A: 

In the very beginning one could spend a lot of time debugging this kind of code:

$a = 1;
echo $a;      # 1
echo "$a";    # 1
echo '$a';    # $a

damn quotes! very frustrating :(

SilentGhost
Use an editor/ide which highlights variables in code, or at least colors single and double quoted strings differently.
OIS
highlighting might be a clue, but you need to know the difference yourself
SilentGhost
why the downvote?
SilentGhost
+2  A: 

Not getting compiler messages for if/else branches:

if( $foo )
{
  some_function();
}
else
{
  non_existing_function();   // oops!
}

PHP won't mention that non_existing_function does not exist until you enter a situation where $foo is false.


Forgetting to set:

error_reporting( E_ALL );

So notices are not caught, spending time debugging:

  • non existing variables
  • invalid object properties
  • invalid array keys

Pasting strings together of different "types" / sources, without escaping them:

// missing mysql_real_escape_string() or an int cast !
$sql = "SELECT * FROM persons WHERE id=$id";

// missing htmlentities() and urlencode() !
$html = "<a href='?page=$id'>$text</a>";  
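A minimal sketch of the escaped versions, using an int cast for the SQL value and urlencode()/htmlspecialchars() for the HTML. The sample $id and $text values are made up to show hostile input; for real queries, prepared statements are generally the safer route.

```php
<?php
$id   = "7 OR 1=1";             // hypothetical hostile input
$text = "<b>Tom & Jerry</b>";   // hypothetical user-supplied label

// The int cast keeps only the leading number, so the injected SQL is dropped.
$sql = "SELECT * FROM persons WHERE id=" . (int) $id;

// urlencode() for the query string value, htmlspecialchars() for the visible text.
$html = "<a href='?page=" . urlencode($id) . "'>"
      . htmlspecialchars($text, ENT_QUOTES, 'UTF-8')
      . "</a>";
```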
vdboor
+1  A: 

As per http://stackoverflow.com/questions/3117604/why-is-calling-a-function-such-as-strlen-count-etc-on-a-referenced-value-so-sl/3117608

If you pass in a variable to a function by reference, and then call a function on it, it's incredibly slow.

If you loop over the function call and the variable is large it can be many orders of magnitude slower than if the variable is passed by value.

Example:

<?php
function TestCount(&$aArray)
{
    $aArray = range(0, 100000);
    $fStartTime = microtime(true);

    for ($iIter = 0; $iIter < 1000; $iIter++)
    {
        $iCount = count($aArray);
    }

    $fTaken = microtime(true) - $fStartTime;

    print "took $fTaken seconds\n";
}

$aArray = array();
TestCount($aArray);
?>

This consistently takes about 20 seconds to run on my machine (on PHP 5.3).

But if I change the function to pass by value (ie function TestCount($aArray) instead of function TestCount(&$aArray)), then it runs in about 2ms - literally 10,000 times faster!

The same is true for any function that passes by value - both built-in functions such as strlen, and for user-defined functions.

This is a rather scary tarpit that I was previously unaware of!

Fortunately there's a simple workaround that is applicable in many cases - use a temporary local variable inside the loop, and copy to the reference variable at the end.
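The workaround could look roughly like this. The function name TestCountLocal and the $aLocal variable are made up for the sketch, and the timings above were reported for PHP 5.3, so other versions may behave differently:

```php
<?php
function TestCountLocal(&$aArray)
{
    // Work on a plain local variable instead of the referenced one.
    $aLocal = range(0, 100000);
    $iCount = 0;

    for ($iIter = 0; $iIter < 1000; $iIter++)
    {
        $iCount = count($aLocal);
    }

    // Copy the result back to the by-reference parameter once, at the end.
    $aArray = $aLocal;

    return $iCount;
}

$aArray = array();
$iResult = TestCountLocal($aArray);
```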

therefromhere
You can in most cases temporarily save it in a non-referenced var, and assign it at the end.
Dykam
@Dykam - yeah, I've just added a note to reflect that.
therefromhere