views:

424

answers:

2

I have an application that iterates over an array every step and I seem to be getting surprisingly slow results when the array is empty. So, I investigated with some follow-up tests that went something like this:

NSMutableArray* ar = [NSMutableArray array];
double time = CFAbsoluteTimeGetCurrent();
for (int i = 0; i < 10000; i++)
{
    for (NSObject* obj in ar)
    {
        [obj retain];
        [obj release];
    }
}
time = CFAbsoluteTimeGetCurrent() - time;   
printf("Empty Time: %1.12f", time / 10000.0f);

time = CFAbsoluteTimeGetCurrent();
for (int i = 0; i < 10000; i++)
{
    if ([ar count] > 0)
    {
        for (NSObject* obj in ar)
        {
            [obj retain];
            [obj release];
        }
    }
}
time = CFAbsoluteTimeGetCurrent() - time;   
printf("Checked Time: %1.12f", time / 10000.0f);

I tried this for 100 | 1,000 | 10,000 iteration intervals with the following results:

Empty Time: 0.000000039935          //100
Checked Time: 0.000000020266        //100
Empty Time: 0.000000018001          //1000
Checked Time: 0.000000011027        //1000
Empty Time: 0.000000015503          //10000
Checked Time: 0.000000008899        //10000

Strangely, this shows that having the simply count check significantly improves performance on low-iteration runs (probably because of caching schemes). This is absolutely astonishing for me as I expected the Objective-C compile/runtime to already do this check whenever a foreach loop is executed! Does anyone have any idea why this might be the case and if there is any way to squeeze even more performance out of this loop setup? Thanks!

+7  A: 

Empty arrays aren't very common in a typical Cocoa program, nor would iterating an empty array 10s of thousands of times.

It would be exceptionally surprising to ever see enumeration of empty arrays show up in Instruments as a significant consumer of CPU cycles.

Given that the Foundation and Core Foundation are optimized to real world performance patterns, it isn't surprising that no 0 count check is made.

However, if you really must iterate an empty array a bazillion times, the fastest way is to use a block:

time = CFAbsoluteTimeGetCurrent();
[ar enumerateObjectsUsingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
    [obj retain];
    [obj release];
}];

I pasted your code into a Foundation tool's main() and got this on a relatively recent MacBook Pro:

     Empty Time: 0.000000019896
   Checked Time: 0.000000007498
     Block Time: 0.000000000298

Of course, instead of empty arrays, just use nil. I.e. I did all the tests a second time after ar = nil;.

ar = nil;
time = CFAbsoluteTimeGetCurrent();
for (int i = 0; i < 10000; i++)
{
    for (NSObject* obj in ar)
    {
        [obj retain];
        [obj release];
    }
}
... etc ...


      Empty Time: 0.000000019902
    Checked Time: 0.000000007999
      Block Time: 0.000000000298
  nil Empty Time: 0.000000015599
nil Checked Time: 0.000000004703
  nil Block Time: 0.000000000000

Overall, though, if your data structures are that complex and you are pounding on them that much on every frame render, I'd suggest a different data structure might be in order.

Of course, only if you've actually used Instruments to sample the code and are optimizing something that is consuming a significant percentage of overall CPU cycles.

bbum
HOLY POO! The block implementation reduced execution time by a factor of 100?! It must be using GCD for that kind of performance, yes? That's pretty incredible I will have to look into that. Thanks! Oh, yeah, and I'm building a game, so iterating over empty arrays happens EVERYWHERE every frame, so performance for empty arrays is important. Fortunately, the count check wasn't as bad as I expected (something like 20 lines of boiler-plate code). Thanks again!
Grimless
Strange. I tried your block implementation and it actually tripled the execution time! Here's what I got: Checked Time: 0.000000009954 Empty Time: 0.000000016987 Block Time: 0.000000037014. NOTE: this was done 1000 times, so it's possible that block creation is really what killed this. EDIT: Yeah, so I got rid of the fori loop and tried the straight-up block run, and it reduced run time by 2x. Nice solution!
Grimless
OOPS! Forgot to adjust the other loops. Yeah, so the block implementation increased run time by 3x...Ouch.
Grimless
The behavior might be quite a bit different on a device vs. on Snow Leopard.
bbum
After testing extensively with nil, turns out that iterating over a nil array vs. checking for an empty array is almost identical in performance. Oh, well. I still like the lazy loading approach of only creating the array when needed and destroying it once it is empty. Thanks!
Grimless
A: 

The for-in construct is not free, it must be resolving to some kind of enumeration method call, so the times reported actually make sense. I would use plain C array in this case. Also you will get better performance if you use objc_msgsend() to call objc methods in such big loops.

fhj