I need to read a large XML document from the network and split it into smaller XML documents. In particular, the stream I read from the network looks something like this:
<a>
<b>
...
</b>
<b>
...
</b>
<b>
...
</b>
<b>
...
</b>
....
</a>
I need to break this up into chunks of
<a> <b> ... </b> </a>
(I only actually need the <b> ... </b> parts, as long as the namespace bindings declared higher up (e.g. in <a>) are moved to <b>, if that makes it easier.)
The file is too big for a DOM-style parser, so it has to be processed as a stream. Is there any XML library that can do this?
[Edit]
I think what I'm ideally looking for is the ability to run XPath queries on an XML stream, where the stream parser only parses as far as necessary to return the next item in the result node set (and all its attributes and children). It doesn't have to be XPath, but something along those lines.
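To make the desired behaviour concrete, here is a rough sketch of the kind of thing I mean. It assumes Python and the standard library's xml.etree.ElementTree.iterparse, neither of which is a requirement; the function name iter_chunks and the helper local_name are made up for illustration. It yields each top-level <b> as its own small document and discards it afterwards, so memory use should stay roughly proportional to one <b> rather than the whole file.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace-uri}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def iter_chunks(stream, tag="b"):
    """Yield every direct child of the root named `tag` as an XML string.

    `stream` is any binary file-like object. Elements are serialized one at
    a time and removed from the root afterwards, so the full tree is never
    held in memory. (Hypothetical helper, not part of any library.)
    """
    root = None
    depth = 0
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # the first element seen is the root <a>
            depth += 1
        else:
            depth -= 1
            # depth == 1 after closing means elem is a direct child of root
            if depth == 1 and local_name(elem.tag) == tag:
                elem.tail = None  # drop whitespace that followed the element
                # tostring() re-declares any namespaces the element uses,
                # so bindings from <a> end up on the emitted <b>.
                yield ET.tostring(elem, encoding="unicode")
                root.remove(elem)  # free the subtree we just emitted


if __name__ == "__main__":
    # Small in-memory stand-in for the network stream.
    sample = b'<a xmlns:x="urn:example"><x:b>first</x:b><x:b>second</x:b></a>'
    for chunk in iter_chunks(io.BytesIO(sample)):
        print(chunk)
```

For a real network source, any binary file-like object should work in place of the BytesIO, for example the response returned by urllib.request.urlopen. One caveat with this sketch: ElementTree rewrites namespace prefixes on output (ns0, ns1, ...) unless they are registered, so the bindings are preserved but the original prefix names are not.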
Thanks!