How do I parse an XML document as a stream using Scala? I've used the StAX API in Java to accomplish this, but I'd like to know if there is a "Scala" way to do it.
+1
A:
scala.xml.XML.loadFile(fileName: String)
scala.xml.XML.load(url: URL)
scala.xml.XML.load(reader: Reader)
scala.xml.XML.load(stream: InputStream)
There are others...
Randall Schulz
2010-05-25 21:40:59
Won't that load the entire document into memory? I'd like to parse it and handle it as I receive it and not store the entire document in memory first.
ScArcher2
2010-05-25 22:01:04
@ScArcher2: Yes, those are not streaming APIs.
Randall Schulz
2010-05-26 04:33:53
+11
A:
Use package scala.xml.pull. Snippet taken from the Scaladoc for Scala 2.8:
import scala.xml.pull._
import scala.io.Source

object reader {
  val src = Source.fromString("<hello><world/></hello>")
  val er = new XMLEventReader(src)

  def main(args: Array[String]) {
    while (er.hasNext)
      Console.println(er.next)
  }
}
You can call toIterator or toStream on er to get a true Iterator or Stream.
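In practice you would usually pattern match on the events rather than print them. The following is only a minimal sketch, assuming the Scala 2.8 scala.xml.pull API used above; the object name CountElements and the helper countStartTags are made up for illustration. It walks the stream once and counts start tags per element label, so the whole document is never held in memory as a tree.

```scala
import scala.io.Source
import scala.xml.pull._

// Hypothetical example object; the names are not part of any library.
object CountElements {
  // Walk the event stream once and tally start tags per element label.
  def countStartTags(src: Source): Map[String, Int] = {
    val er = new XMLEventReader(src)
    var counts = Map.empty[String, Int]
    while (er.hasNext) {
      er.next match {
        case EvElemStart(_, label, _, _) =>
          counts = counts.updated(label, counts.getOrElse(label, 0) + 1)
        case _ => // ignore end tags, text, comments and other events
      }
    }
    counts
  }

  def main(args: Array[String]): Unit = {
    val src = Source.fromString("<hello><world/><world/></hello>")
    println(countStartTags(src))
  }
}
```

The other event types (EvElemEnd, EvText, EvEntityRef, EvProcInstr, EvComment) can be matched the same way if you need them.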
And here's the 2.7 version, which is slightly different. However, testing suggests that, unlike the 2.8 version, it doesn't detect the end of the stream.
import scala.xml.pull._
import scala.io.Source

object reader {
  val src = Source.fromString("<hello><world/></hello>")
  val er = new XMLEventReader().initialize(src)

  def main(args: Array[String]) {
    while (er.hasNext)
      Console.println(er.next)
  }
}
Daniel
2010-05-25 23:10:01
I had to change my code to val er = new XMLEventReader().initialize(src), but it looks like it's working. Thanks!
ScArcher2
2010-05-26 21:03:30
The 2.7 xml.pull is savagely borked and should be avoided entirely by all but the very brave. The 2.8 situation is much improved.
Seth Tisue
2010-05-29 00:17:51