I've seen both of the following approaches, and as far as I know the choice is largely subjective. Given the option, which would you use, and why? If the data were large, would either one have a speed or memory benefit?

function processData(&$data_to_process) { // Pass by reference.
    // do something to the data
}

// ... somewhere else

$this->processData($some_data);

or

function processData($data_to_process) { // Pass by value.
    // do something to the data
    return $data_to_process;
}

// ... somewhere else

$some_data = $this->processData($some_data);
A: 

It's redundant, in most cases, to return a reference that the caller already has, which is what happens in your second example. The only case that I can think of where it is useful is for chaining method calls.

Generally, use a reference parameter when the function will change the state of that parameter, and use the return value to introduce something new to the caller.
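
A minimal sketch of that convention, assuming two made-up helpers (normalizeInPlace() and summarize() are hypothetical names, not from the question). The first changes its argument, so it takes a reference; the second hands something new back, so it returns a value:

```php
<?php

// Hypothetical helper: changes the caller's array, so it takes a reference.
function normalizeInPlace(array &$values)
{
    foreach ($values as $key => $value) {
        $values[$key] = trim(strtolower($value));
    }
}

// Hypothetical helper: produces something new, so it returns a value.
function summarize(array $values)
{
    return count($values) . ' item(s): ' . implode(', ', $values);
}

$tags = [' PHP ', 'Arrays'];
normalizeInPlace($tags);      // $tags itself is modified
$summary = summarize($tags);  // $tags is untouched; a new string comes back
```

The call site then shows which kind of function it is: a bare call for the one that mutates, an assignment for the one that produces a value.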

In PHP the second example isn't pass by reference; it's pass by value, so modifying the parameter inside the function won't change the caller's data. I know that's not true of all languages (e.g. Java, where a method can modify an object passed to it), so I guess this question doesn't necessarily apply to every one.
kevmo314
+4  A: 

PHP copies on write, so if the data doesn't change in the function, using a reference only makes things run slower.
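
A small sketch of what copy-on-write means for a by-value parameter (countEven() and withFirstZeroed() are hypothetical names used only for illustration). Reading the parameter leaves it shared with the caller; the first write is what separates it into its own copy:

```php
<?php

// Read-only use: the parameter can keep sharing the caller's array.
function countEven(array $numbers)
{
    $even = 0;
    foreach ($numbers as $n) {
        if ($n % 2 === 0) {
            $even++;
        }
    }
    return $even;
}

// Writing to the parameter is what triggers the copy;
// the caller's array is left as it was.
function withFirstZeroed(array $numbers)
{
    $numbers[0] = 0; // the array is separated (copied) at this point
    return $numbers;
}

$data = range(1, 10);
$evenCount = countEven($data);     // no write, so no copy is needed
$changed = withFirstZeroed($data); // $data[0] is still 1 afterwards
```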

In your case you are changing the data, so passing by value forces a copy. You can test it with the following:

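<?php

define('N', 100000);
$data = range(1, N); // N integers at indices 0 .. N-1
srand(1);            // fixed seed so both runs touch the same index

// Modifies the caller's array in place through the reference.
function ref(&$data)
{
    $data[rand(0, N - 1)] = 1;
}

// Modifies a by-value parameter, which forces PHP to copy the array.
function ret($data)
{
    $data[rand(0, N - 1)] = 1;
    return $data;
}

// Memory before the call (current, then peak).
echo memory_get_usage()."\n";
echo memory_get_peak_usage()."\n";

ref($data);
// $data = ret($data);

// Memory after the call (current, then peak).
echo memory_get_usage()."\n";
echo memory_get_peak_usage()."\n";

?>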

Run it once with ref() and once with ret(). My results:

ref()

  • 8043280 (before / current)
  • 8044188 (before / peak)
  • 8043300 (after / current)
  • 8044216 (after / peak)

ret()

  • 8043352 (before / current)
  • 8044260 (before / peak)
  • 8043328 (after / current)
  • 12968632 (after / peak)

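So, as you can see, peak memory grows by about 4.9 MB when the data is modified in the function and returned, because PHP has to copy the array, while with the reference it stays essentially flat. If memory is the concern, passing by reference is the better choice here.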

However, passing by reference can be dangerous if it's not obvious that it is occurring. Often you can avoid this question altogether by encapsulating your data in classes that modify their own data.
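
As a rough illustration of the encapsulation idea, here is a sketch with a made-up DataSet class (the class and its scale() and total() methods are assumptions for the example, not anything from the question). The data lives inside the object, so the caller never chooses between a reference parameter and a return value:

```php
<?php

// Hypothetical class that owns its data and modifies it itself.
class DataSet
{
    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    // Changes the object's own data and returns $this to allow chaining.
    public function scale($factor)
    {
        foreach ($this->items as $i => $item) {
            $this->items[$i] = $item * $factor;
        }
        return $this;
    }

    // Read-only accessor: sums the current items.
    public function total()
    {
        return array_sum($this->items);
    }
}

$set = new DataSet([1, 2, 3]);
$total = $set->scale(2)->scale(5)->total(); // (1 + 2 + 3) * 10
```

Returning $this from the mutating method is optional; it is only there to permit chained calls.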

Note that if you use objects, PHP 5 passes the object handle rather than a copy of the object, so a function can modify the caller's object without an explicit &.
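
A minimal sketch of how objects behave as arguments, assuming a made-up Counter class and two hypothetical functions. The function receives a copy of the object handle, so it can change the caller's object without &, but reassigning the parameter does not affect the caller's variable:

```php
<?php

// Hypothetical class used only for this illustration.
class Counter
{
    public $count = 0;
}

// No & needed: the function gets a copy of the handle,
// which still points at the caller's object.
function increment(Counter $counter)
{
    $counter->count++;
}

// Reassigning the parameter only rebinds the local handle;
// the caller's variable still refers to the original object.
function replace(Counter $counter)
{
    $counter = new Counter();
    $counter->count = 100;
}

$c = new Counter();
increment($c); // $c->count is now 1
replace($c);   // $c is unaffected by the reassignment inside replace()
```

Declaring the parameter as &$counter in replace() would be the way to let the function swap out the caller's object entirely.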

konforce
Alright, cool. Thanks for answering both parts of the question btw.
kevmo314