I'm currently trying to read in an XML file, make some minor changes (alter the value of some attributes), and write it back out again.

I intended to use a StAX parser (javax.xml.stream.XMLStreamReader) to read in each event, see if it was one I wanted to change, and then pass it straight on to the StAX writer (javax.xml.stream.XMLStreamWriter) if no changes were required.

Unfortunately, that doesn't look to be so simple: the writer has no way to take an event type and a parser object, only methods like writeAttribute and writeStartElement. Obviously I could write a big switch statement with a case for every possible type of event which can occur in an XML document, and just write it back out again, but that seems like a lot of trouble for something which should be simple.

Is there something I'm missing that makes it easy to write out a very similar XML document to the one you read in with StAX?

+1  A: 

After a bit of mucking around, the answer seems to be to use the Event reader/writer versions rather than the Stream versions.

(i.e. javax.xml.stream.XMLEventReader and javax.xml.stream.XMLEventWriter)

See also http://www.devx.com/tips/Tip/37795, which is what finally got me moving.
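A minimal sketch of that approach, using only the JDK's javax.xml.stream classes. The class name AttributeRewriter, the rewrite method and its parameters are made up for illustration, and matching attributes by local name only is an assumption; adapt the test to whatever identifies the attributes you want to change. Every event is passed straight to the writer, and only start elements are rebuilt with the altered attribute:

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, not part of any library.
class AttributeRewriter {

    // Copies every event from the input to the output, replacing the value of
    // any attribute whose local name is attrName (assumption: namespace ignored).
    static String rewrite(String xml, String attrName, String newValue)
            throws XMLStreamException {
        XMLEventFactory events = XMLEventFactory.newInstance();
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer =
                XMLOutputFactory.newInstance().createXMLEventWriter(out);

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                StartElement start = event.asStartElement();
                List<Attribute> attrs = new ArrayList<>();
                Iterator<?> it = start.getAttributes();
                while (it.hasNext()) {
                    Attribute a = (Attribute) it.next();
                    if (a.getName().getLocalPart().equals(attrName)) {
                        // Events are immutable, so build a replacement attribute.
                        a = events.createAttribute(a.getName(), newValue);
                    }
                    attrs.add(a);
                }
                // Rebuild the start element with the (possibly changed) attributes.
                event = events.createStartElement(
                        start.getName(), attrs.iterator(), start.getNamespaces());
            }
            // All other events are written through unchanged.
            writer.add(event);
        }
        writer.close();
        reader.close();
        return out.toString();
    }
}
```

Calling AttributeRewriter.rewrite(xml, "state", "new") would set every state attribute to "new" and leave the rest of the document as the writer re-serializes it. Details such as the XML declaration and whitespace may differ from the input, depending on the StAX implementation.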

Matt Sheppard
+1  A: 

StAX works pretty well and is very fast. I used it in a project to parse XML files of up to 20 MB. I don't have a thorough analysis, but it was definitely faster than SAX.

As for your question: the difference between streaming and event handling, AFAIK, is control. With the streaming API you can walk through your document step by step and get the contents you want, whereas with the event-based API you can only handle what you are interested in.
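The step-by-step walk can be sketched as follows, taking "streaming API" to mean javax.xml.stream.XMLStreamReader (an assumption about what is meant here). The class CursorWalk, the collect method and its parameters are hypothetical names used only for illustration. The caller drives the loop and pulls out just the attribute values it wants:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of any library.
class CursorWalk {

    // Advances through the document one event at a time and collects the
    // value of attrName from every element whose local name is elementName.
    static List<String> collect(String xml, String elementName, String attrName)
            throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> values = new ArrayList<>();
        while (reader.hasNext()) {
            int type = reader.next();
            if (type == XMLStreamConstants.START_ELEMENT
                    && reader.getLocalName().equals(elementName)) {
                // A null namespace URI means the namespace is not checked.
                String value = reader.getAttributeValue(null, attrName);
                if (value != null) {
                    values.add(value);
                }
            }
        }
        reader.close();
        return values;
    }
}
```

Because the reader only exposes the current event, nothing is retained between iterations unless the code stores it, which is why this style stays cheap on large files.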


I know this is a rather old question, but if anyone else is looking for something like this, there is another alternative: the Woodstox Stax2 extension API has a method:

XMLStreamWriter2.copyEventFromReader(XMLStreamReader2 r, boolean preserveEventData) 

which copies the currently pointed-to event from the stream reader using the stream writer. This is not only simple but also very efficient. I have used it for similar modifications with success.

(How do you get an XMLStreamWriter2 etc.? All Woodstox-provided instances implement these extended versions, and there are also wrappers in case someone wants to use "basic" StAX variants.)