Hello dear Stackoverflow community!

Let's get straight to my question: I have a socket, and all input arriving on its stream is parsed by my SAX parser. On a certain parsing event, I'd like to close the socket / stream from within my SAX event handler. In another case, I want to close the stream from outside while the parser is still working. Unfortunately, I can't do either without the parser throwing an exception (unexpected document ending...). OK, I could catch this exception, but do you know a way to close the stream cleanly?

+1  A: 

I don't think you can easily do this. You're giving the SAX parser a resource (a stream) to read from, and then you're closing it and the SAX parser still expects to read from it - hence it's (not unreasonably!) throwing an 'unexpected document ending'.

If you want to do this cleanly, I think the SAX handler you've implemented should silently swallow events once you've decided you're no longer interested in them.

e.g. your implementations of startElement(), endElement() etc. should perform a check to see whether you're still interested in these events before processing.

That way the SAX parser can run cleanly to the end of the document without you processing any more events.
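A minimal sketch of that idea, assuming a made-up `<stop/>` element stands in for the "certain parsing event" from the question. The class name `IgnoringHandler` and the `stopProcessing()` method are hypothetical, not part of any library; the flag is `volatile` so that another thread could also set it from outside.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects element names until it sees <stop/>,
// then silently ignores every later event.
class IgnoringHandler extends DefaultHandler {
    final List<String> seen = new ArrayList<>();
    private volatile boolean done = false; // volatile: may be set from another thread

    // Lets code outside the handler switch off processing as well.
    void stopProcessing() {
        done = true;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (done) {
            return; // no longer interested, swallow the event
        }
        if ("stop".equals(qName)) { // assumed trigger element
            done = true;
            return;
        }
        seen.add(qName);
    }

    // Parses the whole document; the parser still runs to the real end.
    static List<String> parse(String xml) throws Exception {
        IgnoringHandler handler = new IgnoringHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        // <b/> and <c/> arrive after <stop/> and are ignored: prints [root, a]
        System.out.println(parse("<root><a/><stop/><b/><c/></root>"));
    }
}
```

Note that this only helps if the document actually ends; the stream itself stays open until the parser has read everything.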

Alternatively, why not record the fact that you've closed the input stream? Then, when you get an 'unexpected document ending' exception, check whether it was in fact expected, and only log an error if it really was unexpected.
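Roughly, that bookkeeping could look like the sketch below. The class `ExpectedEndDemo`, the `<stop/>` trigger and the in-memory stream are all assumptions for illustration: a truncated string stands in for a socket that was closed mid-document. With a real socket the failure may surface as an `IOException` rather than a `SAXException`, so both are caught.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ExpectedEndDemo {
    // Returns true if parsing stopped early after we closed the stream on purpose,
    // false if the document was parsed to its real end.
    // Any failure we did not cause ourselves is rethrown.
    static boolean parse(String xml) throws Exception {
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        AtomicBoolean closedOnPurpose = new AtomicBoolean(false);

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) throws SAXException {
                if ("stop".equals(qName)) {       // assumed trigger element
                    closedOnPurpose.set(true);    // record the intent first
                    try {
                        in.close();               // then close the stream
                    } catch (IOException e) {
                        throw new SAXException(e);
                    }
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
            return false;                         // reached the real end of the document
        } catch (SAXException | IOException e) {
            if (closedOnPurpose.get()) {
                return true;                      // the "unexpected" ending was expected
            }
            throw e;                              // a genuine error: let the caller log it
        }
    }

    public static void main(String[] args) throws Exception {
        // The missing </root> simulates a stream that ended after we closed it.
        System.out.println(parse("<root><a/><stop/><b/>"));
    }
}
```

Code that closes the stream from outside the handler would set the same flag before calling `close()`.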

Brian Agnew
I like this, assuming the document ever does have an "end," since it's just data coming over a socket. Otherwise, just catch the exception.
Sam Hoice
A: 

If you control the document-generating end, you could send a close-request message back to the server so that it ends the document it is sending. Depending on the details of your complete system, this is either an ugly hack or an elegant solution... :)

Sam Hoice
A: 

This may be obvious, but for a use case like this, a StAX parser might be a better fit. Since the application controls reading via iteration, it can close the parser and the underlying stream at any given point. With SAX you have to throw an exception to stop early, which is not particularly elegant or efficient. Plus, you can only do that from within the handler.
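To illustrate the pull model, here is a small sketch using the standard `javax.xml.stream` API. The class name `StaxEarlyStop` and the `<stop/>` trigger element are assumptions for the example. Since `XMLStreamReader.close()` does not close the underlying input source, the stream is closed separately.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StaxEarlyStop {
    // Reads element names until <stop/> is seen, then simply stops pulling events.
    static List<String> readUntilStop(InputStream in) throws Exception {
        List<String> seen = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("stop".equals(name)) { // assumed trigger element
                        break;                 // the application decides when to stop
                    }
                    seen.add(name);
                }
            }
        } finally {
            reader.close(); // releases parser resources, not the stream
            in.close();     // with a socket, this would be the socket's input stream
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream(
                "<root><a/><stop/><b/></root>".getBytes(StandardCharsets.UTF_8));
        // <b/> is never read: prints [root, a]
        System.out.println(readUntilStop(in));
    }
}
```

No exception is needed to leave the loop, and because the loop lives in application code, a check of an externally set flag could be added to it just as easily.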

For extra points, StaxMate can make using StAX more pleasant; without it, StAX has a similarly low level of abstraction to SAX.

Finally: if your problem is blocking on the socket, it may be hard to solve with traditional blocking-IO-based XML parsers. There is one open source XML parser that can do non-blocking (async) parsing, but it's rather little known, so I'll leave that discovery to interested readers. :-)

StaxMan