Hi,

I wanted to convert every string in an array to lowercase and was wondering what the most efficient method is. I came up with two options, one using array_walk and one using foreach, and wanted to compare them. Is this the best way to compare the two? Is there an even more efficient method that I have overlooked?

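<?php
// Note: genRandomString() is evaluated once per array_fill() call,
// so each array holds 200000 copies of a single random string.
$a = array_fill(0, 200000, genRandomString());
$b = array_fill(0, 200000, genRandomString());

// Option 1: array_walk with a create_function() lambda
$t = microtime(true);
array_walk($a, create_function('&$a', '$a = strtolower($a);'));
echo "array_walk: ".(microtime(true) - $t);
echo "<br />";

// Option 2: foreach by reference
$t = microtime(true);
foreach($b as &$source) { $source = strtolower($source); }
echo "foreach: ".(microtime(true) - $t);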


function genRandomString($length = 10) {
    $characters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    $string = '';    

    for ($p = 0; $p < $length; $p++) {
        $string .= $characters[mt_rand(0, strlen($characters)-1)];
    }

    return $string;
}

The output:

array_walk: 0.52975487709045
foreach: 0.29656505584717
A: 

I don't know PHP, so this is a wild guess:

str_split(strtolower(implode("", $a)))
Marcelo Cantos
Thanks for the suggestion, however this method will take up more memory than the rest.
Gazler
No worries. You should mention memory constraints in your question. Efficiency, when unqualified, is normally taken to mean speed.
Marcelo Cantos
To be fair, it was also slower. :)
Gazler
You mean it actually worked? That's great! ;-)
Marcelo Cantos
+2  A: 

Two questions in one!

How to run the tests:

Personally, I'd write individual test scripts for each method, then use the Apache ab utility to run the tests:

ab -n 100 -c 1 http://localhost/arrayWalkTest.php
ab -n 100 -c 1 http://localhost/foreachTest.php

That gives me a much more detailed set of statistics for comparison.

I'd also try to ensure that the two methods were working on identical datasets for each test, not different random data.
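
One way to get identical datasets in separate test scripts is to seed the random number generator before building the array. This is only a minimal sketch: buildDataset() is a hypothetical helper name and the seed value 42 is arbitrary; neither comes from the question.

```php
<?php
// Hypothetical helper (not from the question): builds a reproducible dataset.
function buildDataset(int $count, int $length = 10, int $seed = 42): array
{
    $characters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    $max = strlen($characters) - 1;

    mt_srand($seed); // fixed seed, so every call yields the same sequence

    $data = [];
    for ($i = 0; $i < $count; $i++) {
        $string = '';
        for ($p = 0; $p < $length; $p++) {
            $string .= $characters[mt_rand(0, $max)];
        }
        $data[] = $string;
    }

    return $data;
}

// Each test script would call the helper with the same arguments.
$a = buildDataset(1000);
$b = buildDataset(1000);
```

Unlike array_fill(0, $n, genRandomString()), this generates a different string for each element, while both scripts still see the same input.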

The most efficient method:

You should unset($source) after your loop as a safety measure: because you're accessing by reference in the loop, $source will still contain a reference to the last entry in the array and may give you unpredictable results if you reference $source anywhere else in your script.
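
A small sketch of the pitfall, using made-up sample arrays ($words and $broken are illustrative names only):

```php
<?php
// Safe version: break the reference once the loop is done.
$words = ['Alpha', 'Beta', 'Gamma'];
foreach ($words as &$source) {
    $source = strtolower($source);
}
unset($source); // $source no longer points at $words[2]

$source = 'something else'; // plain variable now, $words is untouched

// Unsafe version: the reference survives the loop.
$broken = ['Alpha', 'Beta', 'Gamma'];
foreach ($broken as &$item) {
    $item = strtolower($item);
}
$item = 'overwritten'; // silently replaces the last element of $broken
```

After this runs, $words still ends in 'gamma', while the last element of $broken has been replaced by 'overwritten'.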

Mark Baker
Thanks, I was unaware of the ab utility, seems exactly what I need.
Gazler
ab is incredibly useful for this kind of test because you can set the number of times to execute the test and the level of concurrency, and you get a detailed statistical breakdown across the set of results, allowing you to see the variances as well.
Mark Baker
+1  A: 

I have had lots of weird results in the past when using the microtime approach rather than a dedicated profiler, such as the ones in XDebug or Zend_Debugger. Also, for a fair comparison, your arrays should be identical instead of two random arrays.

In addition, you could consider using array_map and strtolower:

$a = array_map('strtolower', $a);

which would save you the lambda for array_walk. Anonymous functions created with create_function (unlike PHP 5.3's anonymous functions) are known to be slow and strtolower is a native function, so using it directly should be faster.
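
For comparison, here is a rough sketch of both variants on a tiny made-up array: the native function name passed straight to array_map, and array_walk with a PHP 5.3+ closure in place of create_function (which was deprecated in PHP 7.2 and removed in PHP 8.0). The variable names are illustrative only.

```php
<?php
$input = ['Foo', 'BAR', 'baz9'];

// Native function passed by name: no user-defined callback involved.
$viaMap = array_map('strtolower', $input);

// array_walk with a closure instead of create_function().
// The value is taken by reference so the array is modified in place.
$viaWalk = $input;
array_walk($viaWalk, function (&$value) {
    $value = strtolower($value);
});
```

Both produce the same lowercased array; array_map returns a new array, whereas array_walk changes the one it is given.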

I did a quick benchmark and I don't see any relevant speed difference between this approach and your foreach. As is so often the case, I'd say it's a micro-optimization. Of course, you should test that in a real-world application if you think it matters. Synthetic benchmarks are fun, but ultimately useless.

On a sidenote, to change the array keys, you can use

Gordon