I've been trying to diagnose a memory leak in an Android application I'm writing. I got a heap dump loaded into Eclipse, but the results I'm seeing are very curious. There are some 20,000 instances of an exception (specifically, LDAPException from the UnboundID LDAP library) in the heap with no inbound references.

That is, they show up at the root of the dominator tree. The OQL query SELECT objects e FROM com.unboundid.ldap.sdk.LDAPException e WHERE (inbounds(e).length = 0) returns over 20,000 results, accounting for nearly all of the heap. And yet the GC runs before the heap dump is taken, and I can see in the console that it runs repeatedly while the leaky code executes. If these instances have no inbound references, what could be keeping them alive?

I also tried doing a "shortest paths to GC" query. It shows one LDAPConnectionReader row retaining 2 instances, and ~20k LDAPException @ <addr> unknown rows with various hex addresses.

Update: I haven't had time to diagnose this further since posting it, and the bounty I posted will end before I'm likely to get to it. I'm awarding it as best I can now, lest the points go to waste. Thanks to everyone who looked into this! I will come back later and update again with the results of further diagnosis, when life is a little less hectic.

+1  A: 

If you are using Eclipse, you can add a breakpoint on the LDAPException. Here you can find a tutorial on how to set one: Eclipse Tip: Breakpoint on Exception.

These breakpoints pause execution whenever an exception of the selected type is thrown. Once you find out which conditions throw so many exceptions, you can fix the bug.

It's not exactly debugging why unreferenced Exceptions are filling the heap, but I hope it can help.

Tomas Narros
Would whoever downvoted this answer actually CARE to explain the downvote?
vladr
+2  A: 

In terms of memory usage, whether or not these exceptions are actually being thrown is pretty much irrelevant.

You'd like the heap dump to show who holds the references, but for some reason it doesn't. I wonder if native code would get symbolized properly in the heap dump tool?

Either way, as something new to try, I'd suggest not debugging the point at which these Exceptions are thrown but where they are created. Put breakpoints on the class and/or all of its constructors. Ideally, you'd just get this information from the heap dump references, but it still may prove to be informative if you can see who is repeatedly constructing these objects... I'm guessing they come from the same place.
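If stopping in the debugger 20,000 times is impractical, the same idea can be sketched in code. A Throwable records its stack trace when it is constructed, not when it is thrown, so the top frame names the method that created it. The class below is only an illustration: CreationSites, record, count and makeQuietly are hypothetical names, and makeQuietly stands in for whatever library code builds the exceptions. It assumes you can get hold of the exception objects somewhere (a catch block, a listener, a result object).

```java
import java.util.HashMap;
import java.util.Map;

class CreationSites {
    // Hypothetical helper: how many exceptions each method has constructed.
    private static final Map<String, Integer> COUNTS = new HashMap<String, Integer>();

    /** Tallies the method that constructed t and returns its name. */
    static synchronized String record(Throwable t) {
        // The trace was filled in by the Throwable constructor,
        // so frames[0] is the method that called "new".
        StackTraceElement[] frames = t.getStackTrace();
        String site = frames.length > 0
                ? frames[0].getClassName() + "." + frames[0].getMethodName()
                : "<no stack trace>";
        Integer seen = COUNTS.get(site);
        COUNTS.put(site, seen == null ? 1 : seen + 1);
        return site;
    }

    /** Number of exceptions recorded for the given creation site. */
    static synchronized int count(String site) {
        Integer seen = COUNTS.get(site);
        return seen == null ? 0 : seen;
    }

    // Stand-in for library code that builds an exception without throwing it.
    static Exception makeQuietly() {
        return new Exception("created, never thrown");
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            record(makeQuietly());
        }
        System.out.println(COUNTS);
    }
}
```

If nearly all of the tallies land on one method, that is the place to look, whether or not the exception ever reaches a throw statement.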

RonU
That's an interesting approach... but the exceptions are being created in a black box, aren't they?
Robert Karl
I'm not terribly familiar with the library in question, but a quick Googling makes me think it's open-source. Even if it isn't, you should be able to look at the current callstack when the debugger stops on the breakpoint to determine the class that's creating these items.
RonU
+1  A: 

I'm not familiar with OQL, or the Android platform in particular, or the inner workings of Java GC on that platform, but what stands out most to me is the missing LDAPException metadata. It's got error codes, messages, methods, etc... Where is it? Was it uninitialized? Are you prevented from posting all that stuff? Something like a server redirecting to itself might make me say "oh, that's weird, but it kinda made sense."

Have you tried swapping this lib for the JDK one? Looks like that should be easy if it's possible.

Then I would start squeezing the heap for everything it's got. GC characteristics could provide clues. Are there instances that escape collection somehow? How many are created per second? What fraction of stale ones are gc'd each pass, or is it a constant amount? Are they being created in a busy loop like Danny was talking about? What if you call System.gc() in a busy loop?
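One rough way to answer the "what fraction gets collected" question is to track instances through weak references and count how many are still alive after a GC request. This is a minimal sketch, not anything from the UnboundID library: SurvivorCount, track and alive are hypothetical names, and it assumes you can call track at some point where the exceptions pass through your own code. Keep in mind that System.gc() is only a request, so the number of collected objects can vary from run to run.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

class SurvivorCount {
    // Weak references do not keep their targets alive.
    private final List<WeakReference<Object>> tracked =
            new ArrayList<WeakReference<Object>>();

    /** Starts watching o without preventing its collection. */
    void track(Object o) {
        tracked.add(new WeakReference<Object>(o));
    }

    /** Number of tracked objects the collector has not cleared yet. */
    int alive() {
        int n = 0;
        for (WeakReference<Object> ref : tracked) {
            if (ref.get() != null) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        SurvivorCount counter = new SurvivorCount();
        List<Exception> held = new ArrayList<Exception>();
        for (int i = 0; i < 1000; i++) {
            Exception e = new Exception("sample " + i);
            counter.track(e);
            if (i % 10 == 0) {
                held.add(e); // deliberately retain every tenth one
            }
        }
        System.gc(); // a hint to the VM, not a guarantee
        System.out.println("alive after gc: " + counter.alive()
                + ", strongly held: " + held.size());
    }
}
```

Sampling alive() once per second during the leaky code path would show whether the survivors grow steadily, plateau, or drop after each GC pass.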

But yeah, that's where I start print debugging. Hopefully there's a better solution. :-P

Robert Karl
Most of the exceptions seem to have a cause of a socket timeout exception attached; not sure why because the library is *also* returning valid LDAP data at the same time. I can't use the JDK version because it is stripped out of Android.
Walter Mundt