views:

6075

answers:

7

I am working on an Adobe Flex app which needs to parse a relatively large XML file. At the moment it is only 35 MB, but in an ideal world it would get much larger in the future. **Edit:** I have no control over the XML file.

I am essentially dropping its contents right into a SQLite database, so I could use the SimpleXML class to turn it into an object and then iterate through it, but I am worried that this would be a bad approach as the file gets larger. Am I being paranoid, or is there a better way of doing this?

+1  A: 

Personally I would try to avoid XML files of this size. It's one of the cons of XML: you need to read the whole file before you can start using it.

Can't you have your database return smaller XML portions, with just what you need, instead of everything at once? It might be a bit more difficult to set up, but in the end it might prove much more scalable.

Luke
+3  A: 

What you want is a SAX XML parser - it can parse a stream without reading the whole thing. I can't find one for AS3, though (although there are other people looking for the same thing.)

SAX operates by raising events as elements traverse the input stream. Pretty handy - I've used it often in the past, and once you're familiar with it, it's useful for a lot of cases. It even works with an open socket where the stream never closes.
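To make the event model concrete, here is a minimal sketch in Python rather than AS3 (no AS3 SAX parser is assumed to exist). It uses the standard library's `xml.sax` module; the `<items>/<item>/<name>` layout and the `NameCollector` and `parse_names` names are made up for illustration and are not taken from the asker's file.

```python
import io
import xml.sax


class NameCollector(xml.sax.ContentHandler):
    """Collects the text of every <name> element as the parser streams past it."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._buffer = None  # None means "not currently inside a <name> element"

    def startElement(self, tag, attrs):
        # Event raised when an opening tag is read from the stream.
        if tag == "name":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, tag):
        # Event raised when a closing tag is read; the record is complete here.
        if tag == "name":
            self.names.append("".join(self._buffer))
            self._buffer = None


def parse_names(stream):
    """Parse a file-like object incrementally and return the collected names."""
    handler = NameCollector()
    xml.sax.parse(stream, handler)
    return handler.names


# Hypothetical sample document standing in for the real file.
SAMPLE = b"<items><item><name>alpha</name></item><item><name>beta</name></item></items>"
names = parse_names(io.BytesIO(SAMPLE))
```

The handler only ever holds the text of the element it is currently inside, which is why memory use stays flat no matter how long the stream is.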

le dorfier
Good point. But I'm even having trouble handling big XML files in AIR! http://stackoverflow.com/questions/1159154/not-even-massive-xml-doc-manipulation-in-air
Yar
+4  A: 

You will definitely run into some performance issues parsing an XML file that large. Back in the Flex 2 days we used SOAP for services and had one data call that pulled back about 5K records; the Flash Player would hang and the browser would go unresponsive for about 10 seconds on a reasonably fast machine. I can't remember the size of that SOAP message, but it couldn't have been more than 1-2 MB.

If it's possible for your backend to transform the XML into an object graph and send it back over AMF, you will see much better performance. Flash Player does really well with large datasets provided they're encoded in AMF (a compact binary format).

Even still, I'd really consider whether you want to send a single result that large or break it up into pieces. At least that way you have a path for better scaling and can give the user some better feedback, e.g. displaying a message such as "Processing Item 6 of 35..."
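As a rough sketch of the "break it up into pieces" idea, written in Python for illustration: the helper names `batches` and `process_in_batches` are hypothetical, and the total record count is assumed to be known up front (for example, sent by the server along with the first piece).

```python
from itertools import islice


def batches(records, size):
    """Yield successive lists of at most `size` records from any iterable."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def process_in_batches(records, size, total_records, handle):
    """Pass each batch to `handle` and return the progress messages produced."""
    total_batches = -(-total_records // size)  # ceiling division
    messages = []
    for number, chunk in enumerate(batches(records, size), start=1):
        messages.append(f"Processing Item {number} of {total_batches}...")
        handle(chunk)
    return messages


# Seven stand-in records handled three at a time.
stored = []
progress = process_in_batches(range(7), 3, 7, stored.extend)
```

In a real client each message would be shown to the user as its batch arrives, instead of being collected in a list.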

cliff.meyers
Thanks for this, I'm wrestling with the same problem, and the answer might be to talk to a ruby backend or something... this is getting annoying. http://stackoverflow.com/questions/1159154/not-even-massive-xml-doc-manipulation-in-air
Yar
+1  A: 

As others have said, parsing large amounts of XML is not recommended and can get quite sluggish. It has been stated that the fastest way to send data between the Flash client and a server-side script is with AMF (Action Message Format), which is binary. If you have ever done anything with the SharedObject class then you have already had some dealings with AMF, as this is the format in which it writes the LSO to your hard drive. AMFPHP was the best solution for this until recently; it has since found a home in the Zend Framework, where it is now ZendAMF.

There is an excellent tutorial here, by Lee Brimelow, one of the Flash developers I look to for inspiration and clarity, that shows how to use ZendAMF.

The speed at which your data becomes available with ZendAMF, compared to plain old XML, is staggering, and the larger the data to be parsed, the more noticeable the difference.

Brian Hodge
http://hodgedev.com

Brian Hodge
+1  A: 

As already mentioned, a SAX parser would be your best bet so that you can process each "event" (node) as it is read, rather than using a DOM parser to read the entire XML file and store it in memory.
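For the XML-to-SQLite case in the question, an incremental load might look like the sketch below. It is Python, not AS3, and it uses ElementTree's `iterparse`, which is an incremental pull parser rather than a true SAX API, though it serves the same purpose of not holding the whole document in memory. The `<item id="...">` layout, the `items` table, and the `load_items` name are assumptions for illustration.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_items(stream, connection):
    """Insert each <item> into SQLite as soon as its closing tag has been read."""
    connection.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER, name TEXT)")
    count = 0
    for _event, element in ET.iterparse(stream, events=("end",)):
        if element.tag != "item":
            continue
        connection.execute(
            "INSERT INTO items (id, name) VALUES (?, ?)",
            (int(element.get("id")), element.findtext("name")),
        )
        count += 1
        # Release the element's children, text and attributes. The emptied
        # element itself stays attached to the root, so memory still grows
        # slightly with the number of records.
        element.clear()
    connection.commit()
    return count


# Hypothetical sample document and an in-memory database.
SAMPLE = (
    b'<items>'
    b'<item id="1"><name>alpha</name></item>'
    b'<item id="2"><name>beta</name></item>'
    b'</items>'
)
connection = sqlite3.connect(":memory:")
inserted = load_items(io.BytesIO(SAMPLE), connection)
rows = connection.execute("SELECT id, name FROM items ORDER BY id").fetchall()
```

Each record is written and then discarded, so the cost per item stays roughly constant as the file grows.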

But if you're going to be using such large datasets then perhaps you could consider exporting your SQLite data in JSON format, rather than XML?

I'm not sure exactly how to export SQLite directly to JSON (without writing your own script to do it). However, a message on the sqlite-users mailing list suggests trying the following unsupported / undocumented source code: http://www.ch-werner.de/sqliteodbc/sqlite3json.tgz

A tutorial on using JSON in Flex can be found at http://www.mikechambers.com/blog/2006/03/28/tutorial-using-json-with-flex-2-and-actionscript-3/

I started using JSON instead of XML hoping to avoid these problems, and it is the same... When the file is bigger than 50 MB, it does not work.
miguelSantirso
A: 

In SQL, there is always a WHERE clause because no one ever wants to see more than 100 results.

You might not have control over the original XML file, but perhaps you can insert something on the server side that does the parsing and extracts only the data you actually want.

Cheers

Richard Haven
A: 

On the speed comparison between XML and JSON: I actually ran a comparison of three options for large, heavy data: XML, JSON and BlazeDS. Believe me, BlazeDS is faster than either of the others. It's really much faster.

Santanu.K