After some work in C and Java I've been more and more annoyed by the wild west laws in PHP. What I really feel that PHP lacks is strict data types. The fact that string('0') == (int)0 == (boolean)false is one example.

You cannot rely on the data type a function returns. Nor can you force the arguments of a function to be of a specific type, which might lead to a non-strict comparison producing something unexpected. Everything can be taken care of, but it still opens the door to unexpected bugs.

Is it good or bad practice to typecast arguments received for a method? And is it good to typecast the return?

For example:

public function doo($foo, $bar) {
   $foo = (int)$foo;
   $bar = (float)$bar;
   $result = $bar + $foo;
   return (array)$result;
}

The example is quite stupid and I haven't tested it, but I think everyone gets the idea. Is there any reason for the PHP-god to convert data types as he wants, besides letting people who don't know about data types use PHP?

A: 

No, it's not good to typecast because you don't know what you'll have in the end. I would personally suggest using functions such as intval(), floatval(), etc.
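For reference, a quick sketch comparing the two. As far as the PHP manual describes it, intval() with its default base of 10 gives the same result as an (int) cast, and only the optional base argument differs. The sample inputs are made up for illustration.

```php
<?php
// Inputs chosen for illustration; none of them come from the original post.
$samples = array('42', '12abc', 'abc', 3.99, null, true);

foreach ($samples as $value) {
    // With the default base of 10, both conversions should agree.
    var_dump(intval($value) === (int)$value);
}

// The optional base argument (applied to strings) is what the cast cannot do.
var_dump(intval('42', 8));   // int(34)
var_dump(intval('ff', 16));  // int(255)
```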

Tomasz Kowalczyk
How so? Either the cast will succeed and you'll have the correct type value, or execution will fail (an exception will be thrown or PHP will bomb out, depending on PHP version). Is there something I'm missing here?
Billy ONeal
I too am curious about what Billy says. intval() seems to do just the same as type casting with (int).
Anders
+2  A: 

You can use type hinting for complex types. If you need to compare value + type you can use "===" for comparison.

(0 === false) => results in false
(0 == false) => results in true

Also you write return (array)$result; which makes no sense. What you want in this case is return array($result) if you want the return type to be an array.
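A small sketch of both points, using only literal values. It also shows where (array) and array() actually differ: for a plain scalar they should give the same result, but not for null or for a value that is already an array.

```php
<?php
// Loose comparison converts the operands before comparing.
var_dump(0 == false);      // bool(true)
var_dump('0' == false);    // bool(true)
var_dump('0' == 0);        // bool(true)

// Strict comparison also requires the types to match.
var_dump(0 === false);     // bool(false)
var_dump('0' === 0);       // bool(false)

// For a scalar, the cast and array() build the same one-element array.
var_dump((array)5 === array(5));      // bool(true)

// They differ for null and for values that are already arrays.
var_dump(count((array)null));         // int(0)
var_dump(count(array(null)));         // int(1)
var_dump((array)array(1, 2));         // still the flat array 1, 2
var_dump(array(array(1, 2)));         // an array nested inside an array
```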

halfdan
I'm aware of the flaws in the example. It's the way I would do it in C, but as you say I would definitely not do it in PHP. It was mostly for the consistency in the example. I'm aware of type hinting, but it does not work for primitive types :( It will just look for class "int" instead
Anders
+1  A: 

I don't think it's bad, but I would go one step further: Use type hinting for complex types, and throw an exception if a simple type isn't one you expect. This way you make clients aware of any costs/problems with the cast (such as loss of precision going from int -> float or float -> int).

The cast to array in your code is misleading, though -- you should just create a new array containing the one value.

That all said, your example above becomes:

public function doo($foo, $bar) {
   if (!is_int($foo)) throw new InvalidArgumentException();
   if (!is_float($bar)) throw new InvalidArgumentException();
   $result = $bar + $foo;
   return array($result);
}
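One caveat with this stricter version, sketched here as a standalone function with a hypothetical name (dooStrict is not from the original): is_float() is false for an integer such as 2, and is_int() is false for a numeric string such as '1', so callers have to pass exactly the expected type.

```php
<?php
// Hypothetical standalone copy of the method above, for illustration only.
function dooStrict($foo, $bar) {
    if (!is_int($foo)) throw new InvalidArgumentException('$foo must be an int');
    if (!is_float($bar)) throw new InvalidArgumentException('$bar must be a float');
    return array($bar + $foo);
}

var_dump(dooStrict(1, 2.5));   // accepted: array holding float(3.5)

// A numeric string for $foo and an int for $bar are both rejected.
foreach (array(array('1', 2.5), array(1, 2)) as $args) {
    try {
        dooStrict($args[0], $args[1]);
    } catch (InvalidArgumentException $e) {
        echo 'rejected: ', $e->getMessage(), "\n";
    }
}
```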
Billy ONeal
I think that this is a great improvement to my thoughts. Is there any overhead to be concerned about?
Anders
Yeah, calling a function in PHP is slow. But as premature optimization is the root of all evil, don't think about these minor, unimportant slowdowns; think about the readability and maintainability of your code ;)
nikic
@Anders: Yes, there is overhead, but I don't think it is significant here. PHP has to check the type in order to perform the cast in any case. Of course, the check is not free though.
Billy ONeal
I think I'll try this way out. Anyway the database queries and transport of those results are always where I find the significant bottlenecks on my site
Anders
+3  A: 

The next version of PHP (probably 5.4) will support scalar type hinting in arguments.

But apart from that: Dynamic type conversion really isn't something you should hate and avoid. Mostly it will work as expected. And if it doesn't, fix it by checking the type with one of the is_* functions, by using strict comparison, and so on.
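For readers on a newer PHP: scalar type declarations did not arrive with 5.4 in the end; they landed in PHP 7.0, together with return type declarations. A minimal sketch of the question's doo() written that way, as a plain function and assuming the default (coercive) mode rather than declare(strict_types=1):

```php
<?php
// Requires PHP 7.0 or later. Same arithmetic as the question's doo(),
// rewritten as a plain function for illustration.
function doo(int $foo, float $bar): array {
    return array($bar + $foo);
}

var_dump(doo(1, 2.5));    // array holding float(3.5)
var_dump(doo('5', 2));    // numeric string and int are coerced: float(7)

try {
    doo('asdf', 2.5);     // not numeric, so the call is rejected
} catch (TypeError $e) {
    echo 'rejected: ', $e->getMessage(), "\n";
}
```

With declare(strict_types=1) at the top of the calling file, the '5' call would be rejected with a TypeError as well.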

nikic
I'm not sure if that was agreed (then again, on php.internals it only takes 2 people talking in private to arrange a release). While there is a patch for it I'm not sure if it was agreed that it would be in PHP.
Ross
Err.. php already supports type hinting (and has for a while).
Billy ONeal
@Billy: It didn't support scalar type hints and that's what we're talking about here.
nikic
@nikic: +1 for edit.
Billy ONeal
May we also see type hinting for the return of a method or will they only go half way?
Anders
Right now only type hinting for function arguments is supported. There has been an RFC for return value type hinting for some years, though. Maybe they'll land it ;)
nikic
A: 

For better or worse, loose-typing is "The PHP Way". Many of the built-ins, and most of the language constructs, will operate on whatever types you give them -- silently (and often dangerously) casting them behind the scenes to make things (sort of) fit together.

Coming from a Java/C/C++ background myself, PHP's loose-typing model has always been a source of frustration for me. But through the years I've found that, if I have to write PHP I can do a better job of it (i.e. cleaner, safer, more testable code) by embracing PHP's "looseness", rather than fighting it; and I end up a happier monkey because of it.

Casting really is fundamental to my technique -- and (IMHO) it's the only way to consistently build clean, readable PHP code that handles mixed-type arguments in a well-understood, testable, deterministic way.

The main point (which you clearly understand as well) is that, in PHP, you cannot simply assume that an argument is the type you expect it to be. Doing so can have serious consequences that you are not likely to catch until after your app has gone to production.

To illustrate this point:

<?php

function displayRoomCount( $numBoys, $numGirls ) {
  // we'll assume both args are int

  // check boundary conditions
  if( ($numBoys < 0) || ($numGirls < 0) ) throw new Exception('argument out of range');

  // perform the specified logic
  $total = $numBoys + $numGirls;
  print( "{$total} people: {$numBoys} boys, and {$numGirls} girls \n" );
}

displayRoomCount(0, 0);   // (ok) prints: "0 people: 0 boys, and 0 girls" 

displayRoomCount(-10, 20);  // (ok) throws an exception

displayRoomCount("asdf", 10);  // (wrong!) prints: "10 people: asdf boys, and 10 girls"

One approach to solving this is to restrict the types that the function can accept, throwing an exception when an invalid type is detected. Others have mentioned this approach already. It appeals well to my Java/C/C++ aesthetics, and I followed this approach in PHP for years and years. In short, there's nothing wrong with it, but it does go against "The PHP Way", and after a while, that starts to feel like swimming up-stream.

As an alternative, casting provides a simple and clean way to ensure that the function behaves deterministically for all possible inputs, without having to write specific logic to handle each different type.

Using casting, our example now becomes:

<?php

function displayRoomCount( $numBoys, $numGirls ) {
  // we cast to ensure that we have the types we expect
  $numBoys = (int)$numBoys;
  $numGirls = (int)$numGirls;

  // check boundary conditions
  if( ($numBoys < 0) || ($numGirls < 0) ) throw new Exception('argument out of range');

  // perform the specified logic
  $total = $numBoys + $numGirls;
  print( "{$total} people: {$numBoys} boys, and {$numGirls} girls \n" );
}

displayRoomCount("asdf", 10);  // (ok now!) prints: "10 people: 0 boys, and 10 girls"

The function now behaves as expected. In fact, it's easy to show that the function's behavior is now well-defined for all possible inputs. This is because the cast operation is well-defined for all possible inputs; the casts ensure that we're always working with integers; and the rest of the function is written so as to be well-defined for all possible integers.

Rules for type-casting in PHP are documented in the PHP manual (see the type-specific sections, e.g. "Converting to integer").
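As a rough illustration of those rules, here is what an explicit (int) cast does with a handful of inputs. The values are made up for the example.

```php
<?php
// Each line shows the result of an explicit (int) cast.
var_dump((int)'asdf');    // int(0)   non-numeric string
var_dump((int)'12abc');   // int(12)  leading digits are used
var_dump((int)'7');       // int(7)   numeric string, e.g. from a database row
var_dump((int)3.99);      // int(3)   truncated toward zero
var_dump((int)-3.99);     // int(-3)
var_dump((int)null);      // int(0)
var_dump((int)true);      // int(1)
var_dump((int)array());   // int(0)   empty array
var_dump((int)array(0));  // int(1)   any non-empty array
```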

This approach has the added benefit that the function will now behave in a way that is consistent with other PHP built-ins, and language constructs. For example:

// assume $db_row read from a database of some sort
displayRoomCount( $db_row['boys'], $db_row['girls'] ); 

will work just fine, despite the fact that $db_row['boys'] and $db_row['girls'] are actually strings that contain numeric values. This is consistent with the way that the average PHP developer (who does not know C, C++, or Java) will expect it to work.


As for casting return values: there is very little point in doing so, unless you know that you have a potentially mixed-type variable, and you want to always ensure that the return value is a specific type. This is more often the case at intermediate points in the code, rather than at the point where you're returning from a function.

A practical example:

<?php

function getParam( $name, $idx=0 ) {
  $name = (string)$name;
  $idx = (int)$idx;

  if($name==='') return null;
  if($idx<0) $idx=0;

  // $_REQUEST[$name] could be null, or string, or array
  // this depends on the web request that came in.  Our use of
  // the array cast here, lets us write generic logic to deal with them all
  //
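  // isset() avoids an "undefined index" notice when the parameter is missing
  $param = isset($_REQUEST[$name]) ? (array)$_REQUEST[$name] : array();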

  if( count($param) <= $idx) return null;
  return $param[$idx];
}

// here, the cast is used to ensure that we always get a string
// even if "fullName" was missing from the request, the cast will convert
// the returned NULL value into an empty string.
$full_name = (string)getParam("fullName");

You get the idea.


There are a couple of gotchas to be aware of:

  • PHP's casting mechanism is not smart enough to optimize the "no-op" cast. So casting always causes a copy of the variable to be made. In most cases, this is not a problem, but if you regularly use this approach, you should keep it in the back of your mind. Because of this, casting can cause unexpected issues with references and large arrays. See PHP Bug Report #50894 for more details.

  • In PHP, a whole number that is too large (or too small) to represent as an integer type will automatically be represented as a float. This means that the result of ($big_int + $big_int) can actually be a float, and if you cast it to an int the resulting number will be gibberish. So, if you're building functions that need to operate on large whole numbers, you should keep this in mind, and probably consider some other approach.
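The second gotcha can be seen with PHP_INT_MAX. The guard function below is only a sketch under my own assumptions (toIntOrFail is a hypothetical name, and it needs PHP 7.0 or later for PHP_INT_MIN); it conservatively refuses floats at or beyond the integer limits instead of casting them.

```php
<?php
$big = PHP_INT_MAX;

var_dump(is_int($big));            // bool(true)
var_dump(is_float($big + 1));      // bool(true): the sum overflowed into a float
var_dump(is_float($big + $big));   // bool(true)

// Hypothetical helper, not from the original: cast to int only when the
// value is a finite float strictly inside the integer range.
function toIntOrFail($value) {
    if (is_float($value)
        && (!is_finite($value) || $value >= PHP_INT_MAX || $value <= PHP_INT_MIN)) {
        throw new RangeException('value does not fit in an int');
    }
    return (int)$value;
}

var_dump(toIntOrFail(41.9));       // int(41)

try {
    toIntOrFail($big + $big);
} catch (RangeException $e) {
    echo 'rejected: ', $e->getMessage(), "\n";
}
```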


Sorry for the long post, but it's a topic that I've considered in depth, and through the years, I've accumulated quite a bit of knowledge (and opinion) about it. By putting it out here, I hope someone will find it helpful.

Lee