Hello,

I just spent a whole week tracking down and whacking memory leaks over the head, and I arrived on the other end of that week a little dazed. There has to be a better way to do this, is all I can think, and so I figured it was time to ask about this rather heavy subject.

This post turned out to be rather huge. Apologies for that, though I think in this case, explaining the details as thoroughly as possible is warranted. Especially so because it gives you the whole picture of everything I did to find this bugger, which was a lot. This bug alone took me roughly three 10+ hour days to track down...

When I hunt leaks

When I hunt leaks I tend to do it in phases, where I escalate "deeper" into the problem if it's not solvable in an earlier phase. These phases begin with Leaks telling me there's an issue.

In this particular case (which is an example; the bug is solved; I'm not asking for answers to solving this bug, I'm asking for ways to improve the process in which I find the bug), I am finding a leak (two, even) in a multithreaded application which is fairly large, especially including the 3 or so external libraries I'm using in it (unzip feature and http server). So let's see the process where I fix this leak.

Phase 1: Leaks tells me there's a leak

[Screenshot: Leaks showing 2 GeneralBlock-160 leaks at 160 bytes in Foundation's NSPushAutoreleasePool]

Well, that's interesting. Since my app is multithreaded, my first thought is that I forgot to put an NSAutoreleasePool in somewhere, but after checking in all the right places, this is not the case. I take a look at the stack trace.

Phase 2: The stack trace

[Screenshot: the stack trace for the leak]

Both of the GeneralBlock-160 leaks have identical stack traces (which is odd since I have it grouped by "identical backtraces", but anyway), which start at thread_assign_default and end at malloc under _NSAPDataCreate. In between, there is absolutely nothing that correlates to my app. Not a single one of those calls is "mine". So I do some Googling around to figure out what these might be used for.

First we have a number of methods which obviously have to do with a thread callback, such as POSIX thread calls going into NSThread calls.

At #8-#6 in this (inverted) stack trace, we have +[NSThread exit] followed by pthread_exit and _pthread_exit, which is interesting, but in my experience I can't really tell if it's indicative of some specific case or if it's simply "how things go".

After that we have a thread cleanup method called _pthread_tsd_cleanup -- whatever "tsd" stands for I'm not sure, but regardless, I move on.
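
For reference, "tsd" is thread-specific data: per-thread slots whose destructors pthreads runs when a thread exits. Below is a minimal C sketch of that mechanism, assuming _pthread_tsd_cleanup is the step that runs such destructors. It is plain POSIX, not Foundation's actual code, and slot_key, worker and tsd_cleanup_runs_on_worker are made-up names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t slot_key;        /* hypothetical per-thread slot */
static pthread_t spawning_thread;     /* the thread that starts the worker */
static int cleanup_count = 0;
static int cleanup_ran_off_spawner = 0;

/* Destructor registered with the key. pthreads calls it at thread exit,
   on the exiting thread, for each thread that stored a non-NULL value. */
static void slot_destructor(void *value) {
    cleanup_count++;
    cleanup_ran_off_spawner = !pthread_equal(pthread_self(), spawning_thread);
    free(value);
}

static void *worker(void *unused) {
    (void)unused;
    int *data = malloc(sizeof *data);
    assert(data != NULL);
    *data = 42;
    pthread_setspecific(slot_key, data);
    return NULL;  /* returning ends the thread, which triggers the cleanup */
}

/* Returns 1 if the destructor ran exactly once and not on the caller's thread. */
int tsd_cleanup_runs_on_worker(void) {
    pthread_t tid;
    spawning_thread = pthread_self();
    cleanup_count = 0;
    cleanup_ran_off_spawner = 0;

    int rc = pthread_key_create(&slot_key, slot_destructor);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    pthread_join(tid, NULL);  /* the worker's cleanup has finished by now */

    pthread_key_delete(slot_key);
    return cleanup_count == 1 && cleanup_ran_off_spawner;
}
```

Calling tsd_cleanup_runs_on_worker() returns 1: whatever a thread parked in its thread-specific slots is torn down on that thread as it exits. That matches the shape of the traces here, where the cleanup sits directly under pthread_exit.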

At #4-#3 we have:

CA::Transaction::release_thread(void*)
CAPushAutoreleasePool

Interesting. We have Core Animation here. That, I've learned the very hard way, means that I'm probably doing UIKit calls from a background thread, which I must not. The big question is where, and how. While it may be easy to say "thou shalt not call UIKit from ye olde background thread", it's not as easy to know what exactly constitutes a UIKit call. As you'll see in this case, it's far from obvious.

Then #2-#1 turn out to be way too low level to be of any real use. I think.

I still have no clue where to even begin looking for this memory leak. So I do the only thing I can think of.

Phase 3: return galore

Suppose we have a call tree that looks something like this:

App start
    |
Some init
  |      \
A init   B init - Other case - Fourth case
   \     /              \
 Some case            Third case
     |
  Fifth case
   ...

Rough outline of an app's lifecycle, that. In short, we have a number of paths the app can take depending on whatever happens, and each of these paths consists of a bunch of code being called in various places. So I pull out the scissors and start chopping. I start close to "App start" initially, and slowly move down the line towards crossroads, where I only allow one path.

So I have

// ...
[fooClass doSomethingAwesome:withThisCoolThing];
// ...

And I do

// ...
return;
[fooClass doSomethingAwesome:withThisCoolThing];
// ...

And then install the app on the device, close it down, alt-tab to Instruments, hit cmd-R, hammer on the app like a monkey, look for leaks, and after maybe 10 "cycles" if there's nothing, I conclude that the leak is further down the code. Possibly in fooClass's doSomethingAwesome: or below the call to fooClass.

So I move that return one step below the call to fooClass and test again. If the leak doesn't appear now, great, fooClass is innocent.

There are a few issues with this method.

  1. Memory leaks tend to be a bit snobbish about when to reveal themselves. You need romantic music and candles, so to say, and cutting one end off in one place sometimes results in the memory leak deciding not to appear at all. I often had to go back because the leak had appeared after I added, say, this line: UIImage *a; (which obviously isn't leaking by itself)
  2. It's excruciatingly slow and tiring to do for a big program. Especially if you end up having to back up again.
  3. It's hard to keep track of. I kept putting in // 17 14.48.25: 3 leaks @ RSx10 which in English meant "July 17th, 14:48.25: 3 leaks occurred when I repeatedly selected the item 10 times" sprinkled throughout the entire app. Messy, but at least it let me see clearly where I'd tested things and what the results were.

This method eventually took me down to the very bottom of a class which handled thumbnails. The class had two methods: one which initialized things and then did a [NSThread detachNewThreadSelector:toTarget:withObject:] call to a separate method, which processed the actual images and put them into the individual views after scaling them down to the right size.

It was sort of like this:

// no leaks if I return here
[NSThread detachNewThreadSelector:@selector(loadThumbnails) toTarget:self withObject:nil];
// leaks appear if I return here

But if I went into -loadThumbnails and stepped down through it, the leaks would disappear and appear in a very random fashion. At one extensive run, I would have leaks and if I moved the return statement down below e.g. UIImage *small, *bloated; I would have leaks appearing. In short, it was very erratic.

After some more testing, I realized that leaks would tend to appear more often if I reloaded things quicker while in the app. After many hours of pain, I realized that if this external thread did not finish executing before I loaded another session (thus creating a second thumbnail class and discarding this one), the leak would appear.

That's a nice clue. So I added a BOOL called worldExists which was set to NO as soon as a new session was initiated, and then started sprinkling -loadThumbnails's for loop with

if (worldExists) [action]
if (worldExists) [action 2]
// ...

and also made sure to exit the loop as soon as I found out that !worldExists. But the leak remained.

And the return method was still showing leaks in very erratic places, seemingly at random.

So I tried adding this at the very top of -loadThumbnails:

for (int i = 0; i < 50 && worldExists; i++) {
    [NSThread sleepForTimeInterval:0.1f];
}
return;

And believe it or not, the leaks actually appeared if I loaded a new session within 5 seconds.

Finally, I put a breakpoint in -dealloc for the thumbnail class. The stack trace for this looked like this:

#0  -[Thumbs dealloc] (self=0x162ec0, _cmd=0x32299664) at /Users/me/Documents/myapp/Classes/Thumbs.m:28
#1  0x32c0571a in -[NSObject release] ()
#2  0x32b824d0 in __NSFinalizeThreadData ()
#3  0x30c3e598 in _pthread_tsd_cleanup ()
#4  0x30c3e2b2 in _pthread_exit ()
#5  0x30c3e216 in pthread_exit ()
#6  0x32b15ffe in +[NSThread exit] ()
#7  0x32b81d16 in __NSThread__main__ ()
#8  0x30c8f78c in _pthread_start ()
#9  0x30c85078 in thread_start ()

Well... that doesn't look too bad. If I wait until the -loadThumbnails method is finished, the trace looks different though:

#0  -[Thumbs dealloc] (self=0x194880, _cmd=0x32299664) at /Users/me/Documents/myapp/Classes/Thumbs.m:26
#1  0x32c0571a in -[NSObject release] ()
#2  0x00009556 in -[WorldLoader dealloc] (self=0x192ba0, _cmd=0x32299664) at /Users/me/Documents/myapp/Classes/WorldLoader.m:33
#3  0x32c0571a in -[NSObject release] ()
#4  0x000045b2 in -[WorldViewController setupWorldWithPath:] (self=0x11e9d0, _cmd=0x3fee0, path=0x4cb84) at /Users/me/Documents/myapp/Classes/WorldViewController.m:98
#5  0x32c29ffa in -[NSObject performSelector:withObject:] ()
#6  0x32b81ece in __NSThreadPerformPerform ()
#7  0x32c23c14 in CFRunLoopRunSpecific ()
#8  0x32c234e0 in CFRunLoopRunInMode ()
#9  0x30d620da in GSEventRunModal ()
#10 0x30d62186 in GSEventRun ()
#11 0x314d54c8 in -[UIApplication _run] ()
#12 0x314d39f2 in UIApplicationMain ()
#13 0x00002fd2 in main (argc=1, argv=0x2ffff5dc) at /Users/me/Documents/myapp/main.m:14

Quite different, in fact. At this point, I was still clueless, believe it or not, but I finally figured out what was going on.

The problem is the following: when I do [NSThread detachNewThreadSelector:] in the thumbnail loader, NSThread retains the object until the thread finishes. In the case where the thumbnail loading doesn't finish before I load another session, all of my retains on the thumbnail loader are released, but since the thread is still running, NSThread keeps it alive.

As soon as the thread returns from -loadThumbnails, NSThread releases it, it hits 0 retain and goes straight into -dealloc... while still in the background thread.

And when I then call [super dealloc], UIView obediently tries to remove itself from its superview, which is a UIKit call on a background thread. Consequently a leak occurs.
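
The mechanics behind this can be reproduced without Cocoa. Here is a minimal C sketch, assuming a hand-rolled reference count stands in for retain/release and a pthread stands in for the detached NSThread. Thumbs, thumbs_retain, thumbs_release and the other names are hypothetical. The point it shows: whichever thread drops the count to zero is the thread that runs the teardown.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct { atomic_int refcount; } Thumbs;  /* stand-in for the thumbnail view */

static pthread_t owner_thread;        /* plays the role of the main thread */
static int dealloc_on_owner = -1;     /* -1 until the teardown has run */

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cond = PTHREAD_COND_INITIALIZER;
static int owner_released = 0;

static void thumbs_retain(Thumbs *t) { atomic_fetch_add(&t->refcount, 1); }

static void thumbs_release(Thumbs *t) {
    /* fetch_sub returns the previous value: 1 means this was the last reference */
    if (atomic_fetch_sub(&t->refcount, 1) == 1) {
        /* "dealloc": runs on whichever thread happened to release last */
        dealloc_on_owner = pthread_equal(pthread_self(), owner_thread) ? 1 : 0;
        free(t);
    }
}

static void *load_thumbnails(void *arg) {
    Thumbs *t = arg;
    /* Simulate slow image work: wait until the owner has already let go. */
    pthread_mutex_lock(&gate_lock);
    while (!owner_released) pthread_cond_wait(&gate_cond, &gate_lock);
    pthread_mutex_unlock(&gate_lock);
    thumbs_release(t);  /* the thread's reference, dropped as the thread ends */
    return NULL;
}

/* Returns 1 if the teardown ran on the owner thread, 0 if on the worker. */
int dealloc_thread_when_owner_lets_go_first(void) {
    pthread_t tid;
    owner_thread = pthread_self();
    owner_released = 0;
    dealloc_on_owner = -1;

    Thumbs *t = malloc(sizeof *t);
    assert(t != NULL);
    atomic_init(&t->refcount, 1);  /* the owner's reference */

    thumbs_retain(t);              /* the thread keeps its target alive */
    int rc = pthread_create(&tid, NULL, load_thumbnails, t);
    assert(rc == 0);

    thumbs_release(t);             /* new session: the owner drops the old object */
    pthread_mutex_lock(&gate_lock);
    owner_released = 1;
    pthread_cond_signal(&gate_cond);
    pthread_mutex_unlock(&gate_lock);

    pthread_join(tid, NULL);
    return dealloc_on_owner;
}
```

dealloc_thread_when_owner_lets_go_first() returns 0 here: the worker held the last reference, so the teardown ran on the worker. With a UIView in place of Thumbs, that teardown is where the UIKit work ends up on the background thread.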

The solution I came up with was to wrap the loader with two new methods. I renamed the original loader to -_loadThumbnails and then did the following:

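[self retain]; // <-- added this before the detaching; balanced by the release in -doneLoadingThumbnails
[NSThread detachNewThreadSelector:@selector(loadThumbnails) toTarget:self withObject:nil];

// added these two new methods

// Runs on the main thread, so if this release is the last one, -dealloc runs there too.
- (void)doneLoadingThumbnails
{
    [self release];
}

// Thread entry point: do the real work, then hand the extra retain back to the main thread.
- (void)loadThumbnails
{
    [self _loadThumbnails];
    [self performSelectorOnMainThread:@selector(doneLoadingThumbnails) withObject:nil waitUntilDone:NO];
}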
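
The same hand-off can be sketched in plain C. This is a sketch under assumptions, not the Cocoa implementation: a one-slot mailbox guarded by a mutex stands in for performSelectorOnMainThread:, a manual reference count stands in for retain/release, and Loader, loader_retain, loader_release and pending_release are hypothetical names. The sketch also makes the worker drop its own reference before posting the hand-off, so the extra reference is guaranteed to be the last one. I have not verified that the Cocoa version guarantees that ordering, since NSThread releases its target only after the thread's method returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct { atomic_int refcount; } Loader;  /* stand-in for the thumbnail loader */

static pthread_t ui_thread;           /* plays the role of the main thread */
static int freed_on_ui = -1;          /* -1 until the teardown has run */

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_ready = PTHREAD_COND_INITIALIZER;
static Loader *pending_release = NULL;  /* one-slot "main thread queue" */

static void loader_retain(Loader *l) { atomic_fetch_add(&l->refcount, 1); }

static void loader_release(Loader *l) {
    if (atomic_fetch_sub(&l->refcount, 1) == 1) {  /* last reference gone */
        freed_on_ui = pthread_equal(pthread_self(), ui_thread) ? 1 : 0;
        free(l);
    }
}

static void *background_load(void *arg) {
    Loader *l = arg;
    /* ... the image work would happen here ... */
    loader_release(l);  /* the thread's own reference; the extra one keeps l alive */
    /* Hand the extra reference to the UI thread instead of releasing it here. */
    pthread_mutex_lock(&q_lock);
    pending_release = l;
    pthread_cond_signal(&q_ready);
    pthread_mutex_unlock(&q_lock);
    return NULL;
}

/* Returns 1 if the teardown ran on the UI thread, 0 if on the worker. */
int final_release_lands_on_ui_thread(void) {
    pthread_t tid;
    ui_thread = pthread_self();
    freed_on_ui = -1;
    pending_release = NULL;

    Loader *l = malloc(sizeof *l);
    assert(l != NULL);
    atomic_init(&l->refcount, 1);  /* the owner's reference */
    loader_retain(l);              /* the extra one, like [self retain] */
    loader_retain(l);              /* the thread's, like NSThread retaining its target */
    int rc = pthread_create(&tid, NULL, background_load, l);
    assert(rc == 0);

    loader_release(l);             /* new session: the owner lets go early */

    /* The "run loop": wait for the hand-off, then release on this thread. */
    pthread_mutex_lock(&q_lock);
    while (pending_release == NULL) pthread_cond_wait(&q_ready, &q_lock);
    Loader *done = pending_release;
    pending_release = NULL;
    pthread_mutex_unlock(&q_lock);
    loader_release(done);          /* the equivalent of -doneLoadingThumbnails */

    pthread_join(tid, NULL);
    return freed_on_ui;
}
```

final_release_lands_on_ui_thread() returns 1: the owner's and the worker's releases can happen in either order, but neither can be the last one, so the teardown always lands on the thread that drains the mailbox.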

All that said (and I said a lot -- sorry about that), the big question is: how do you figure these odd-ball things out without going through all of the above?

What reasoning did I miss in the above process? At what point did you realize where the problem was? What were the redundant steps in my method? Can I skip phase 3 (return galore) somehow, or cut it down, or make it more efficient?

I know this question is, well, vague and huge, but this whole concept is vague and huge. I'm not asking you to teach me how to find leaks (I can do that... it's just very, very painful), I'm asking what people tend to do to cut down on the process time. Asking people "how do you find leaks?" is impossible, because there are so many different kinds. But the one type I tend to have issues with is the one that looks like the above, with no calls inside your actual app.

What process do you use to more efficiently track it down?

+1  A: 

What reasoning did I miss in the above process?

Sharing UIView objects between multiple threads should have had very loud alarm bells going off in your head, pretty much as soon as you were writing the code.

JeremyP
Well, the reason in this case was simply because I was loading large-ish images in large numbers. I more or less have to do that in a background thread or the UI will be frozen for a long time. That and, I admit, I didn't realize when I started writing that it wasn't allowed, so I had some scattered UI calls that I fixed in the process.
Kalle
Technically, the UIKit framework isn't thread-safe. I'm kind of surprised they don't have assertions in there that cause it to bail immediately when you try to access your view from a secondary thread.
Ben Gotow
Yeah, you'd think it'd be more obvious but I think [NSThread isMainThread] can be expensive so they'd rather you do it. For the record, while I was fixing the above bug I was 100% aware of this fact, and the error that I solved above has nothing to do with UIKit calls in the end, it has to do with deallocation whilst in a bg thread, which is very different.
Kalle
@Kalle: In your question you say: "And when I then call [super dealloc], UIView obediently tries to remove itself from its superview, which is a UIKit call on a background thread. Consequently a leak occurs." Removing a view from a superview is a UIKit call. You need to think of a better way of passing information between threads than passing around whole views.
JeremyP
@JeremyP: I'm not passing around views, really... I have a view which has thumbnails which it generates. If its generation ends after the session ends and a new session has begun, there's a new thumbnail view and the old one has been released. I in fact have two of these views (one for portrait, one for landscape) for each session. Reusing them might be an option, but at a glance it doesn't seem like that would help much. Aside from horrible dealloc issues. :)
Kalle
@Kalle: yes you are. How could a background thread cause a UIView to remove itself from its superview if the background thread had not been passed a pointer to the view - either directly or indirectly?
JeremyP
@JeremyP: I think we're kind of getting entangled in terminology here. If you are saying / did say "never run a background thread on a UIKit object using [NSThread detachNewThreadSelector:toTarget:withObject:]", that'd be a different story though. The problem is that a lot of people are doing this. It's just uncommon for said UIKit object to become invalidated while the thread is running.
Kalle
@Kalle: yes I am saying that. As you have found out, it will bite you in the backside detaching threads with a UIView as the target. A better design would be to have a collection object that contains the thumbnails with a method that can be used by the background thread to load the images.
JeremyP
@JeremyP: yeah, that makes sense I guess. I just see lots of other examples where people do this very thing, so it didn't strike me as terribly bad when I put it in. On another note, I kind of meant for this question to be a bit more generic on "general (oddball) leak tracking practice" but I'm not getting any other responses so I'm gonna mark your answer. Thanks for the detailed responses. :)
Kalle
@Kalle thanks. If it helps, my answer *was* sort of trying to give you a generic answer. When I first started doing Objective-C I was getting leaks all over the place - or double deallocs. If you do enough O-C programming, however, you develop a sort of instinct for detecting the kinds of patterns that cause trouble.
JeremyP
@JeremyP: I hear ya! That week of anguish did give me an entirely new perspective on tracking these things down, so maybe it's not so much about finding a generic method as it is building up a healthy list of things you should never do.
Kalle
+1  A: 

In the future, you might consider taking a look at other memory leak hunting tools, like MallocDebug.

karlphillip
That's a nice little document. I'll have to play around with MallocDebug some and see how it differs from Leaks.
Kalle