views:

312

answers:

2

I'm sure it should be obvious, but I couldn't find any references on my question. What underlying technology does Scala XML use? Is it something DOM-like, SAX-like, or StAX-like? What performance penalties should I be aware of when processing large documents? Is StAX still more efficient?

Thanks in advance.

+5  A: 

Large documents (several hundred MB) can be processed with scala.xml.pull.XMLEventReader. See the nightly scaladoc (assuming you'll be using 2.8). It uses a pull-parser model like StAX.
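
A minimal sketch of the pull approach, assuming a Scala version that still ships `scala.xml.pull` (it was later deprecated and removed from newer scala-xml releases). The object name `PullCount` and the method `countElements` are made up for illustration. It counts start tags with a given label without ever building a tree.

```scala
import scala.io.Source
import scala.xml.pull.{EvElemStart, XMLEventReader}

object PullCount {
  // Hypothetical helper: walks the event stream and counts start tags
  // whose label matches, so no document tree is held in memory.
  def countElements(src: Source, label: String): Int = {
    val reader = new XMLEventReader(src) // an Iterator[XMLEvent]
    reader.count {
      case EvElemStart(_, `label`, _, _) => true
      case _                             => false
    }
  }
}
```

For a real file you would pass something like `Source.fromFile("big.xml")` (the file name is a placeholder) instead of an in-memory string.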

In general, compared to Java, Scala does its own thing when dealing with XML. XML values are immutable. You can also use XML literals directly in your Scala code, which tends to make the code more readable.
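
To illustrate the two points above, here is a small sketch, assuming a compiler that accepts XML literals and has scala.xml on the classpath. `LiteralDemo`, `greeting`, and `withLang` are names invented for the example. Adding an attribute yields a new element and leaves the original as it was.

```scala
import scala.xml.{Elem, Null, UnprefixedAttribute}

object LiteralDemo {
  // XML literal with an embedded Scala expression as the attribute value.
  def greeting(to: String): Elem = <greeting to={to}>hello</greeting>

  // `%` returns a new Elem with the attribute added; `e` itself is not modified.
  def withLang(e: Elem, lang: String): Elem =
    e % new UnprefixedAttribute("lang", lang, Null)
}
```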

In response to the comment, XML.load uses javax.xml.parsers.{ SAXParser, SAXParserFactory } as underlying technology. I also assume that the resulting xml is loaded in memory.
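
For comparison, a rough sketch of the load-everything style, using `XML.loadString` on a small inline document (the sample XML and the names `LoadWhole` and `itemTexts` are assumptions for the example). The whole tree is built first and then queried as a `NodeSeq`.

```scala
import scala.xml.XML

object LoadWhole {
  // Parses the entire input into an in-memory tree, then selects
  // the direct <item> children and returns their text content.
  def itemTexts(xml: String): Seq[String] = {
    val doc = XML.loadString(xml)
    (doc \ "item").map(_.text)
  }
}
```

`XML.load` and `XML.loadFile` behave the same way for streams and files, so memory use grows with document size.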

huynhjl
Thank you. I forgot to mention that my question concerns parsing only (the `XML.load()` method, to be precise, and the corresponding `NodeSeq` stuff). I suppose the whole document is loaded and parsed if I use this API?
incarnate
Thanks again, your addition is the answer I was looking for.
incarnate
+1  A: 

Scala does its own thing. Most of the XML models out there are mutable and do not translate well to immutability (mostly because their nodes keep track of their parent).

Here's a paper about it.

Daniel