Hi everybody,

I have been using a SAX parser for a while now to get data from various XML files, but today I'm banging my head against a new problem with a huge XML file (huge compared to the previous ones: around 12k lines) that has a lot of repetitive items in it. Most of the time, the items are part of a block:

  <content>
    <item lbl="blabla">
      <item lbl="blabla"/>
      <item lbl="blabla"/>
    </item>

    <item lbl="blabla">
      <item lbl="blabla"/>
      <item lbl="blabla"/>
      <item lbl="blabla"/>
      <item lbl="blabla"/>
      <item lbl="blabla"/>
      <item lbl="blabla"/>
    </item>
  </content>

The blabla part changes, of course. But I would like to keep the structure of the items (they are titles and subtitles). To do that, I wrap each blabla in a start and end tag, <itemx>blabla</itemx>, where x is the item's depth in the tree of items (1, 2, 3 or 4). The problem is that this creates thousands of useless objects, the garbage collector doesn't seem to have time to clean up after the parser, and the inevitable OutOfMemory hits me in the face... I have no idea how to deal with it. The best approach would be to take the whole content of <content></content> at once, but I'm not sure that this is possible with a SAX parser.

Any help is welcome, and I'd be very grateful for any solution...

+1  A: 

If the data you're attempting to read exceeds the memory available, then you'll need to persist the data to free up memory to keep reading.

Have you considered storing your data in a sqlite database as you read it in?

You should also avoid creating tons of useless temporary objects. Could you get away with mutating a single object, or a small pool of objects, so that garbage doesn't pile up?
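As a rough sketch of both suggestions together (persist as you go, and reuse one buffer), something like the following might work. It assumes the element is called `item` and the label lives in the `lbl` attribute, as in the question. The `ItemFlattener` class, its `Sink` interface and the `flatten` helper are made-up names for illustration; in a real app the sink would be whatever writes a row to your SQLite database.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ItemFlattener extends DefaultHandler {

    /** Hypothetical callback; in the app this could insert one row into SQLite. */
    public interface Sink {
        void accept(String block);
    }

    // One buffer, reused for every top-level <item> block.
    private final StringBuilder block = new StringBuilder();
    private final Sink sink;
    private int depth = 0; // current nesting level of <item> elements

    public ItemFlattener(Sink sink) {
        this.sink = sink;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!"item".equals(qName)) {
            return; // ignore <content> and anything else
        }
        depth++;
        // Emits <itemN>label</itemN>; a missing lbl attribute would print "null".
        block.append("<item").append(depth).append('>')
             .append(atts.getValue("lbl"))
             .append("</item").append(depth).append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (!"item".equals(qName)) {
            return;
        }
        depth--;
        if (depth == 0) {
            // A top-level block is complete: hand it off and clear the buffer,
            // so nothing accumulates across blocks.
            sink.accept(block.toString());
            block.setLength(0);
        }
    }

    /** Demo helper: collects the blocks in a list instead of a database. */
    public static List<String> flatten(String xml) throws Exception {
        List<String> out = new ArrayList<>();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new ItemFlattener(out::add));
        return out;
    }
}
```

With this shape, the only thing kept alive between blocks is the single `StringBuilder`, so memory use should depend on the largest block rather than on the whole file. Note that the labels are copied as-is, without XML escaping.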

If you're looking to get the whole document tree in memory, then you should use a DOM parser (`DocumentBuilder` is available on Android for this). However, if you're running out of memory with the SAX parser, it's quite likely the DOM parser will run out too, unless the real cause is that your SAX event handlers are creating and discarding tons of object instances.
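For reference, a minimal sketch of the DOM route, again assuming the `item` / `lbl` layout from the question. `DomOutline` and `outline` are illustrative names, not part of any library. It loads the whole tree with `DocumentBuilder` and prints each label indented by its depth.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomOutline {

    /** Parses the XML into a DOM tree and returns an indented outline of the labels. */
    public static String outline(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    // Recursively visits child <item> elements; depth 0 means a direct child of the root.
    private static void walk(Element parent, int depth, StringBuilder out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE || !"item".equals(node.getNodeName())) {
                continue; // skip whitespace text nodes and other elements
            }
            Element item = (Element) node;
            for (int d = 0; d < depth; d++) {
                out.append("  ");
            }
            out.append(item.getAttribute("lbl")).append('\n');
            walk(item, depth + 1, out);
        }
    }
}
```

The structure comes for free this way, but the entire document sits in memory at once, which is exactly the trade-off described above.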

Ben S
Indeed, I'm storing the data in an SQLite database, but only at the end of the parsing. And again, yes, I'm creating tons of objects because I'm trying to keep the tree of items, and unfortunately I see no alternative for keeping it. Thanks for your answer, it confirms my fears... :)
Sephy
You can always try to use `DocumentBuilder` since it's meant to hold the whole tree for you in a generic way.
Ben S
+1  A: 

For the most part, you can't "create objects fast enough that the GC can't keep up." In fact, when a GC needs to happen, your entire app is suspended until it completes, so you just can't get ahead of it.

The only exception to this is Bitmaps, which are handled a little specially: they count against the Java heap, even though their allocations don't happen on it. This is fine, except a Bitmap's memory doesn't get freed until its finalizer runs, and finalizers run separately from the garbage collector and do not block the app. So creating a bunch of bitmaps and simply letting go of them (without calling `recycle()` to explicitly release the Bitmap's memory) can indeed cause an out-of-memory error.

But if you aren't allocating (and letting go of) Bitmap objects, you have some other problem, probably just... not having enough memory for all of your allocations. You can use the hat tool (and to a lesser extent the simple Java heap information in DDMS) to see what you have allocated that is using so much space.
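Before reaching for a heap dump, a quick way to check whether usage really climbs steadily while parsing is to log the heap numbers from inside the handler every so often. This is only a rough sketch using the standard `Runtime` calls; `HeapSnapshot` and its method names are made up for illustration.

```java
public class HeapSnapshot {

    /** Bytes currently in use on the Java heap (allocated minus free). */
    public static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper limit the heap is allowed to grow to. */
    public static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** One-line summary suitable for a log statement. */
    public static String describe() {
        return "heap used " + (usedBytes() / 1024) + " KB of max " + (maxBytes() / 1024) + " KB";
    }
}
```

If the "used" figure keeps growing from one top-level block to the next, something is holding on to the parsed data; if it stays flat and the error still occurs, the problem is more likely a single large allocation.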

hackbod