How do I parse an XML document as a stream using Scala? I've used the StAX API in Java to accomplish this, but I'd like to know if there is a "Scala" way to do it.
+1
A:
scala.xml.XML.loadFile(fileName: String)
scala.xml.XML.load(url: URL)
scala.xml.XML.load(reader: Reader)
scala.xml.XML.load(stream: InputStream)
There are others...
Randall Schulz
2010-05-25 21:40:59
Won't that load the entire document into memory? I'd like to parse it and handle it as I receive it and not store the entire document in memory first.
ScArcher2
2010-05-25 22:01:04
@ScArcher2: Yes, those are not streaming APIs.
Randall Schulz
2010-05-26 04:33:53
+11
A:
Use package scala.xml.pull. Snippet taken from the Scaladoc for Scala 2.8:
import scala.xml.pull._
import scala.io.Source

object reader {
  val src = Source.fromString("<hello><world/></hello>")
  val er = new XMLEventReader(src)

  def main(args: Array[String]) {
    while (er.hasNext)
      Console.println(er.next)
  }
}
You can call toIterator or toStream on er to get a true Iterator or Stream.
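As a rough sketch of handling the events one at a time instead of just printing them: you can fold over the reader and pattern match on each event. This assumes a Scala/scala-xml version that still ships scala.xml.pull (as 2.8 does); the object and method names (ElementCounter, countElements) are made up for illustration and are not part of the library.

```scala
import scala.io.Source
import scala.xml.pull._

object ElementCounter {
  // Counts start tags per element label. Only the current event and the
  // running counts are held in memory, not the whole document.
  def countElements(events: Iterator[XMLEvent]): Map[String, Int] =
    events.foldLeft(Map.empty[String, Int]) {
      case (counts, EvElemStart(_, label, _, _)) =>
        counts.updated(label, counts.getOrElse(label, 0) + 1)
      case (counts, _) =>
        counts // ignore end tags, text, comments, etc.
    }

  def main(args: Array[String]): Unit = {
    val er = new XMLEventReader(Source.fromString("<hello><world/></hello>"))
    println(countElements(er))
  }
}
```

For a real file you would presumably swap Source.fromString for something like Source.fromFile; the fold itself stays the same.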
And here's the 2.7 version, which is slightly different. However, testing it seems to indicate it doesn't detect the end of the stream, unlike in Scala 2.8.
import scala.xml.pull._
import scala.io.Source

object reader {
  val src = Source.fromString("<hello><world/></hello>")
  val er = new XMLEventReader().initialize(src)

  def main(args: Array[String]) {
    while (er.hasNext)
      Console.println(er.next)
  }
}
Daniel
2010-05-25 23:10:01
I had to change my code to val er = new XMLEventReader().initialize(src), but it looks like it's working. Thanks!
ScArcher2
2010-05-26 21:03:30
The 2.7 xml.pull is savagely borked and should be avoided entirely by all but the very brave. The 2.8 situation is much improved.
Seth Tisue
2010-05-29 00:17:51