views:

85

answers:

2

Now comes the hard part. How do you optimize this function:

function coin_matrix($test, $revs) {
    $coin = array();

    for ($i = 0; $i < count($test); $i++) {
        foreach ($revs as $j => $rev) {
            foreach ($revs as $k => $rev) {
            if ($j != $k && 
                $test[$i][$j] != null && 
                $test[$i][$k] != null) {
                    if(!isset($coin[$test[$i][$j]])) {
                        $coin[$test[$i][$j]] = array(); 
                    }
                    if(!isset($coin[$test[$i][$j]][$test[$i][$k]])) {
                        $coin[$test[$i][$j]][$test[$i][$k]] = 0; 
                    }

                    $coin[$test[$i][$j]][$test[$i][$k]] += 1 / ($some_var - 1);
                }
            }
        }
    }
    return $coin;
}

I'm not that good at this and if the arrays are large, it runs forever.

The function is supposed to find all pairs of values from a two-dim array and sum them like this: $coin[$i][$j] += sum_of_pairs_in_array_row / [count(elements_of_row) - 1]

Thanks a lot!

A: 

If you really need to be running all combinations of 3 items in the array, you are mostly stuck when it comes to large arrays. The main factor will be the fact that your function is cubic - you just can't do anything about the fact that, as input increases, output time will increase astronomically. You might be able to reduce the time to, say, 30% of where it is now, but if the time is already 3 weeks, 1 week of runtime does little for you.

That being said, you can still save yourself some effort inside the innermost loop. If I am interpreting your code correctly, you can set $coin[$i][$j] = array() just once per i-j combination, rather than check for each k item that it is not yet set.

That also being said, it's still not completely clear what the function is supposed to be doing, so I can't exactly offer other edits with confidence other than to use $coin instead of coin, to save the PHP parser a bit of effort.

Matchu
That's a typo. corrected
Alex
There are still two, in the if statements.
Matchu
yep, thanks! corrected also. see in the comments above for the description of the formula behind the function
Alex
A: 

I don't know why this wasn't pointed out earlier: Change:

for ($i = 0; $i < count($test); $i++) {

To:

$count = count($test);
for ($i = 0; $i < $count; $i++) {

Will save you quite some time. (more if $test is bigger)

I'm not sure if this is right, but it'd cause quite a slowdown as the $some_var doesn't exist:

$coin[$test[$i][$j]][$test[$i][$k]] += 1 / ($some_var - 1);

Finally, I'm still not sure what this is supposed to do. Maybe you can provide some good input and output values. As your stated purpose still doesn't make sense. What's $revs, why is it:

$coin[$i][$j] += sum_of_pairs_in_array_row / [count(elements_of_row) - 1]

instead of :

$coin[$row] += sum_of_pairs_in_array_row / [count(elements_of_row) - 1]
AlReece45
it computes a coincidence matrix for the Krippendorff's Alpha. the exact formula for each element of the matrix that the function is supposed to return is here: http://bit.ly/bS9a92 (wikipedia link).$test is a bi-dimensional matrix, $revs is a vector and the keys from $revs coincide with the row number of $test
Alex