I'm writing a small program that downloads a zipped XML database file, about 30 MB unzipped. As I understand it, the only practical way to handle files this big on the iPhone is NSXMLParser. But the file is encoded as windows-1257, and NSXMLParser does not accept files like that. What can I do? Is there a way to change the file's encoding on the iPhone, or to make NSXMLParser work with files that are not UTF-8 encoded?

A: 

Odds are you'll have to ask the data providers to provide the XML in UTF-8 format, as per the mantra of text encodings:

Use UTF-8. Always.

Williham Totland
A: 

NSXMLParser can also take input from an NSData object, so in some cases you can use NSString methods to read the file in the specified encoding and produce an NSData in UTF-8.

Something like:

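NSError *myError = nil;
// Decode the file using its source encoding (CP1252 here, for illustration).
NSString *str = [NSString stringWithContentsOfFile:myFilePath
                                          encoding:NSWindowsCP1252StringEncoding
                                             error:&myError];
// Re-encode the text as UTF-8 and hand the bytes to the parser.
NSData *XMLData = [str dataUsingEncoding:NSUTF8StringEncoding];
NSXMLParser *parser = [[NSXMLParser alloc] initWithData:XMLData];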

BUT one problem: it doesn't appear that windows-1257 is one of the encodings that NSString knows about, so you may be back to "tell the provider to use UTF-8", unless you want to do the mapping yourself (yuck).

David Gelhar
Well, Windows-1257 is yuck to begin with, but that's beside the point. It bears mentioning that in this multinational, multilingual world of ours, using any 256-restricted character encoding runs a steadily more real risk of causing irreversible data loss, hair loss, impotence, cancers of the pancreas and oral cavity, skin abscesses, melanomas and mild coughs.
Williham Totland
A: 

NSXMLParser is not good at handling large files, since it loads the whole document into memory. Instead, you should consider basing your implementation on libxml2, which can parse the document in small chunks, making it both faster and more memory efficient.

There is an excellent example available that shows how this can be implemented:

XMLPerformance example

libxml2 can be compiled with support for a lot of different encodings as outlined in the documentation. I have however not tested if Windows-1257 is supported by default on the iPhone.

Claus

Claus Broch
A: 

If you are really stuck with windows-1257, do the mapping yourself. It is not that hard. This page lists the Unicode code points for the windows-1257 byte values: http://msdn.microsoft.com/fr-fr/goglobal/cc305170%28en-us%29.aspx

You could even hack your zip library to perform the encoding conversion during decompression.
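
To give an idea of what "doing the mapping yourself" might look like, here is a minimal sketch in plain C (which also compiles as Objective-C). The names cp1257_high, cp1257_to_codepoint, utf8_encode and cp1257_to_utf8 are made up for this illustration. The table only covers bytes 0xC0..0xFF (the Baltic letters); bytes 0x80..0xBF are deliberately left out and come back as U+FFFD, so you would still need to fill in that range from the code page chart before using this on real data.

```c
#include <stddef.h>
#include <stdint.h>

/* Unicode code points for Windows-1257 bytes 0xC0..0xFF. */
static const uint16_t cp1257_high[64] = {
    /* 0xC0 */ 0x0104, 0x012E, 0x0100, 0x0106, 0x00C4, 0x00C5, 0x0118, 0x0112,
    /* 0xC8 */ 0x010C, 0x00C9, 0x0179, 0x0116, 0x0122, 0x0136, 0x012A, 0x013B,
    /* 0xD0 */ 0x0160, 0x0143, 0x0145, 0x00D3, 0x014C, 0x00D5, 0x00D6, 0x00D7,
    /* 0xD8 */ 0x0172, 0x0141, 0x015A, 0x016A, 0x00DC, 0x017B, 0x017D, 0x00DF,
    /* 0xE0 */ 0x0105, 0x012F, 0x0101, 0x0107, 0x00E4, 0x00E5, 0x0119, 0x0113,
    /* 0xE8 */ 0x010D, 0x00E9, 0x017A, 0x0117, 0x0123, 0x0137, 0x012B, 0x013C,
    /* 0xF0 */ 0x0161, 0x0144, 0x0146, 0x00F3, 0x014D, 0x00F5, 0x00F6, 0x00F7,
    /* 0xF8 */ 0x0173, 0x0142, 0x015B, 0x016B, 0x00FC, 0x017C, 0x017E, 0x02D9
};

/* Map one Windows-1257 byte to a Unicode code point. */
static uint32_t cp1257_to_codepoint(unsigned char b)
{
    if (b < 0x80) return b;                        /* ASCII is unchanged */
    if (b >= 0xC0) return cp1257_high[b - 0xC0];   /* table lookup */
    return 0xFFFD;  /* 0x80..0xBF: NOT filled in by this sketch */
}

/* Encode a code point below 0x10000 as UTF-8; returns bytes written (1..3). */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Convert in_len Windows-1257 bytes to UTF-8.
 * out must have room for 3 * in_len bytes. Returns bytes written. */
static size_t cp1257_to_utf8(const unsigned char *in, size_t in_len,
                             unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < in_len; i++) {
        n += utf8_encode(cp1257_to_codepoint(in[i]), out + n);
    }
    return n;
}
```

Because windows-1257 is a single-byte encoding, every input byte converts independently, so a function like this could be called on each buffer as it comes out of the decompressor without worrying about characters split across chunk boundaries. Note that the XML declaration at the top of the file would presumably still say encoding="windows-1257" after conversion, so that attribute likely needs rewriting to UTF-8 as well before the parser sees it.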

FenchKiss Dev