I'd like to parse large XML files and read in a complete node at a time from Java. The files are too large to load into a tree. I'd like to use a pull parser if possible, since it appears to be easier to program for. Given the following XML data
Instead of having to check every event while using the StAX parser, I'd like each call to hasNext or some similar function to return an object containing the complete info on a record node. In Perl, XML::LibXML::Reader allows me to do this using its read method, so I'm looking for an equivalent in Java.

+2  A: 

Commons Digester is really good for this type of problem. It allows you to configure parsing rules whereby when the parser encounters certain tags it performs some action (e.g. calls a factory method to create an object). You don't have to write any parsing code, making development fast and lightweight.

Below is a simple example pattern you could use:

<pattern value="myConfigFile/foos/foo">
    <factory-create-rule classname="FooFactory"/>
    <set-next-rule methodname="processFoo" paramtype="com.foo.Foo"/>
</pattern>

When the parser encounters the "foo" tag it calls createObject(Attributes) on FooFactory, which creates a Foo object that is pushed onto the Digester stack. At the end of the element the parser calls processFoo, passing the new Foo as the argument, on the object directly beneath it on the stack (you would typically push this object onto the stack before commencing parsing). You could therefore implement processFoo to either add these objects to a collection or, if your file is too big, simply process each object as it arrives and then throw it away.

Adamski
+1  A: 

Try XML Pull Parser
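
If an extra dependency is a concern, the same pull style is available through the StAX API that ships with the JDK (javax.xml.stream), and it can be wrapped so that each call hands back one whole record, similar in spirit to the read method of XML::LibXML::Reader. The following is only a minimal sketch and is not XML Pull Parser itself: the class name RecordReader is hypothetical, and it assumes each record element has flat, text-only children, since the sample XML is not shown in the question.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: streams the input and returns one record per read() call.
// Assumes records look like <record><a>text</a><b>text</b></record>.
final class RecordReader {
    private final XMLStreamReader xml;
    private final String recordName;

    RecordReader(Reader in, String recordName) throws XMLStreamException {
        this.xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
        this.recordName = recordName;
    }

    /** Returns the next record as child name -> text, or null at end of input. */
    Map<String, String> read() throws XMLStreamException {
        while (xml.hasNext()) {
            // Skip everything until the next start tag of a record element.
            if (xml.next() == XMLStreamConstants.START_ELEMENT
                    && recordName.equals(xml.getLocalName())) {
                return readFields();
            }
        }
        return null;
    }

    private Map<String, String> readFields() throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        // nextTag() skips whitespace and comments between child elements.
        while (xml.nextTag() == XMLStreamConstants.START_ELEMENT) {
            String name = xml.getLocalName();
            // getElementText() reads the text and leaves the reader on the
            // child's end tag; it throws if the child has nested elements.
            fields.put(name, xml.getElementText());
        }
        // The loop ends on the record's own end tag.
        return fields;
    }
}
```

Only one record is held in memory at a time, so the caller can loop on read() until it returns null. Records with nested children or mixed content would need a different container than a flat map.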

Tom
I may use it, but it doesn't look like it's actively maintained.
Jared