Hi!
I've been running some tests, and one of my needs is to read data from several XML files and concatenate it into a single file. I've managed to do this, but memory consumption seems quite high for the task. The iPhone Simulator didn't even raise a memory warning, but I don't think a real iPhone would tolerate this (I don't have a device to try it on here, so I'm mostly speculating from what I've read).
The main part of the code looks like this:

BOOL success = [fileManager createFileAtPath:documentsPath contents:nil attributes:nil];
[fileManager release];

if (success) {
    NSFileHandle *fileHandle = [NSFileHandle fileHandleForWritingAtPath:documentsPath];
    for (int i = 0; i < 100; i++) {
        // Load the whole file into memory, then build a DOM from it.
        NSString *path = [[NSBundle mainBundle] pathForResource:@"mensagem_de_arquivo"
                                                         ofType:@"xml"];
        NSData *data = [NSData dataWithContentsOfFile:path];
        GDataXMLDocument *xml = [[GDataXMLDocument alloc] initWithData:data options:0 error:nil];
        // Append the text of the first //message/data node to the output file.
        NSArray *tokens = [xml nodesForXPath:@"//message/data" error:nil];
        if (tokens.count > 0) {
            GDataXMLElement *token = (GDataXMLElement *)[tokens objectAtIndex:0];
            [fileHandle writeData:[[token stringValue] dataUsingEncoding:NSASCIIStringEncoding]];
        }
        [xml release];
    }
    [fileHandle closeFile];
}

The "Build and Analyze" command reports no leaks, and the code builds without warnings, yet memory consumption still reaches somewhere between 50 and 70 MB (counting live bytes only; overall bytes are almost double that).
The idea obviously isn't to read the same file 100 times, but it more than suffices as test data, since the code just has to read the contents of XML files and write them to a file in the order they are received.

Is there any way to force the release of temporary objects before new ones are allocated? Could I try to reuse some variables? Any ideas that help me keep this under control are REALLY welcome.

Edit - just to make things a little more interesting: it'd be better to keep a single parser for both reading and writing, so ideally I'd stick with GDataXML or, if a change were needed, use KissXML, TinyXML or libxml (DOM), all of which seem to use a little more memory, as said here. So a way to force the release of memory would be the best option.

Thanks in advance :)

A: 

Yeah, you've "doubled" it by reading it all into an NSData and then parsing that into a DOM in GDataXMLDocument. If you expect to go through lots of XML data like this, looping over multiple files and so on, you should consider SAX-based parsing instead, streaming directly from the file rather than preloading it into an NSData. That way you won't have to "release temporary objects", because you'll only be extracting the information you need as it's parsed.
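As a rough illustration of that suggestion, here is a minimal sketch using Foundation's NSXMLParser, which reports elements through delegate callbacks instead of building a DOM. The class DataNodeWriter and the helper AppendDataNode are hypothetical names, not from the thread. The sketch assumes manual reference counting like the question's code, assumes the <data> element holds only text, and assumes initWithStream: is available (iOS 5 or later). Writing UTF-8 is also an assumption; the question used NSASCIIStringEncoding.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate (not from the thread): writes the text of the first
// <data> element whose parent is <message> to a file handle as it is parsed.
@interface DataNodeWriter : NSObject <NSXMLParserDelegate> {
    NSFileHandle *_output;
    NSMutableArray *_openElements;  // names of the currently open elements
    BOOL _finished;                 // YES once the first matching node has closed
}
- (id)initWithOutput:(NSFileHandle *)output;
- (BOOL)finished;
@end

@implementation DataNodeWriter

- (id)initWithOutput:(NSFileHandle *)output {
    if ((self = [super init])) {
        _output = [output retain];
        _openElements = [[NSMutableArray alloc] init];
    }
    return self;
}

- (void)dealloc {
    [_output release];
    [_openElements release];
    [super dealloc];
}

- (BOOL)finished { return _finished; }

// YES while the innermost open element is <data> directly inside <message>.
- (BOOL)insideTarget {
    NSUInteger n = [_openElements count];
    return !_finished && n >= 2
        && [[_openElements objectAtIndex:n - 1] isEqualToString:@"data"]
        && [[_openElements objectAtIndex:n - 2] isEqualToString:@"message"];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    [_openElements addObject:elementName];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if ([self insideTarget]) {
        // UTF-8 is an assumption; the question's code used NSASCIIStringEncoding.
        [_output writeData:[string dataUsingEncoding:NSUTF8StringEncoding]];
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([self insideTarget]) _finished = YES;  // later matches are ignored
    [_openElements removeLastObject];
}

@end

// Hypothetical helper: parses one file from a stream and appends the node text
// to `output`. Returns YES if a matching node was found.
static BOOL AppendDataNode(NSString *xmlPath, NSFileHandle *output) {
    NSInputStream *stream = [NSInputStream inputStreamWithFileAtPath:xmlPath];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithStream:stream];
    DataNodeWriter *writer = [[DataNodeWriter alloc] initWithOutput:output];
    [parser setDelegate:writer];  // the parser does not retain its delegate
    [parser parse];
    BOOL found = [writer finished];
    [parser release];
    [writer release];
    return found;
}
```

Inside the question's loop, the NSData and GDataXMLDocument lines would then be replaced by a single call such as AppendDataNode(path, fileHandle). How much memory this saves in practice would need to be measured.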

jbm
Yeah, "doubled" is a great estimate. Changing to SAX-based parsing almost halved memory consumption: the same job now uses 38 MB, but takes nearly 5 times longer to finish. I'm keeping the file handle as a private variable on my delegate and writing to the file directly from the "foundCharacters" method. Is there any better way of doing this, considering I only need what's in a single node from my file?
wintermute
Is there any reason to keep the file or the XML around once you've found that node? I'd copy whatever information I needed out of the node, rather than keeping the node, then dispose of anything having to do with the XML file after that.
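One way to read that advice, sketched under assumptions: copy the node's text into a string and call abortParsing as soon as the node closes, so the rest of the document is never parsed. FirstDataNodeReader and FirstDataText are hypothetical names. The sketch assumes manual reference counting, matches any element named "data" without checking its parent, and uses initWithData: only to keep the example short.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: copies the text of the first <data> element,
// then stops the parse because nothing else in the file is needed.
@interface FirstDataNodeReader : NSObject <NSXMLParserDelegate> {
    NSMutableString *_text;
    BOOL _inData;
}
- (NSString *)text;  // nil if no <data> element was seen
@end

@implementation FirstDataNodeReader

- (void)dealloc {
    [_text release];
    [super dealloc];
}

- (NSString *)text {
    return [[_text copy] autorelease];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if (_text == nil && [elementName isEqualToString:@"data"]) {
        _inData = YES;
        _text = [[NSMutableString alloc] init];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times for one node, so append rather than assign.
    if (_inData) [_text appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if (_inData && [elementName isEqualToString:@"data"]) {
        _inData = NO;
        [parser abortParsing];  // skip the remainder of the document
    }
}

@end

// Hypothetical helper: returns the text of the first <data> element, or nil.
static NSString *FirstDataText(NSData *xml) {
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xml];
    FirstDataNodeReader *reader = [[FirstDataNodeReader alloc] init];
    [parser setDelegate:reader];
    [parser parse];  // returns NO after abortParsing, which is expected here
    NSString *text = [reader text];
    [parser release];
    [reader release];
    return text;
}
```

The returned string is the only thing that outlives the call; the parser and delegate are released before the helper returns.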
jbm
A: 

Actually, the solution turned out to be quite simple.

All I had to do was create an NSAutoreleasePool at the start of each loop iteration and drain it at the end.
Like this:

for (int i = 0; i < 100; i++) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    //... do everything I've done before...
    [pool drain];
}

This forces objects that were autoreleased inside the for loop to be released at the end of each iteration, instead of only after the whole loop ends, as happened before. Nothing else is affected, so no object is released before it should be.
Memory consumption dropped from a whopping 60 ~ 80 MB to something like 1.6 MB during the loop, going back to the same 600 KB after it (this was a dummy application that did just this).
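For reference, newer compilers also accept the @autoreleasepool block syntax for the same per-iteration pool, and it compiles under ARC, where NSAutoreleasePool is unavailable. The sketch below is only an illustration: ProcessOnce and ProcessMany are hypothetical stand-ins for the real per-file work, which in the question involves GDataXML.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the work done on each file in the loop.
static NSUInteger ProcessOnce(NSString *path) {
    NSData *data = [NSData dataWithContentsOfFile:path];  // autoreleased object
    return [data length];
}

// Runs the work `count` times with one autorelease pool per iteration.
static NSUInteger ProcessMany(NSString *path, int count) {
    NSUInteger total = 0;
    for (int i = 0; i < count; i++) {
        @autoreleasepool {
            total += ProcessOnce(path);
        }  // objects autoreleased in this iteration are released here
    }
    return total;
}
```

Only plain values such as the running total should cross the pool boundary; an object created inside the block has to be retained if it is needed after the block ends under manual reference counting.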

I'll still leave this question open for a while in case someone has a better idea, but for now it seems this will be the way :)

wintermute