I have a Java program that parses an XML document using the Xerces API.

My parsing class extends org.apache.xerces.parsers.XMLDocumentParser, overriding the startElement, endElement, and characters methods.

Since it's a complicated XML document that is written by hand (mainly configuration elements of some kind), classic validation by XSD or DTD is not enough, so I have to run further checks myself and report to the user when the XML document is not valid.

But one thing I could not achieve is adding to the error messages the line number (and why not the column number too) that is currently being parsed, where the error occurs.

I think this should be possible, because the exceptions (org.apache.xerces.xni.parser.XMLParseException) generated by the parser when the document is not valid XML contain this information.

+2  A: 

I'm not sure what the "right" way would be, but looking at the API, and assuming you provide an XMLInputSource that takes an InputStream or a Reader, you could pass in an InputStream/Reader that is wrapped in a LineNumberInputStream or LineNumberReader and then query it for the line number. (LineNumberInputStream is deprecated in the JDK, so prefer LineNumberReader if you can supply a Reader.)

e.g.:

InputStream stream;

stream = ...;

new XMLInputSource(stream);

would become:

InputStream stream;
LineNumberInputStream lineStream;

stream = ...;
lineStream = new LineNumberInputStream(stream);

new XMLInputSource(lineStream);

// can now ask the line stream what line it is on via getLineNumber()

I am guessing you would also need to pass the LineNumberInputStream/LineNumberReader to your class that extends XMLDocumentParser.
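As a rough, self-contained illustration of the wrapping idea, here is a sketch that uses the JDK's built-in SAX parser as a stand-in for Xerces' XMLDocumentParser (that substitution, the class name ReaderLineDemo, and the sample document are all assumptions made for the example). The handler is handed the same LineNumberReader the parser reads from. Note that the reader counts the line terminators the parser has read so far, so with read-ahead buffering the value may be later than the line of the element being reported.

```java
import java.io.LineNumberReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: a SAX handler that shares the parser's LineNumberReader.
class ReaderLineDemo extends DefaultHandler {
    private final LineNumberReader reader;
    final List<Integer> linesAtStartElement = new ArrayList<>();

    ReaderLineDemo(LineNumberReader reader) {
        this.reader = reader;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Number of line terminators consumed from the reader so far (0-based).
        // This tracks how far the parser has read, not where the element is.
        linesAtStartElement.add(reader.getLineNumber());
    }

    // Line terminators consumed from the reader up to now.
    int finalLine() {
        return reader.getLineNumber();
    }

    static ReaderLineDemo parse(String xml) throws Exception {
        LineNumberReader reader = new LineNumberReader(new StringReader(xml));
        ReaderLineDemo handler = new ReaderLineDemo(reader);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(reader), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        ReaderLineDemo h = parse("<config>\n  <item/>\n</config>\n");
        System.out.println("reader line at each start tag: " + h.linesAtStartElement);
        System.out.println("reader line after parsing: " + h.finalLine());
    }
}
```

On a small document the parser may well pull the whole input into its buffer before reporting the first element, which is why this approach is best treated as a fallback.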

Not sure if all of that is doable in your code.

Alternatively dig into the source and find out how they do it. If the variables/methods you need to access are private, and you are not worried about your code breaking in the future, you could use reflection and remove the access permissions to get at it.

TofuBeer
Your solution was the one I had in mind if this was not possible through the API. I'm glad I don't have to do this :-), but thank you.
chburd
+2  A: 

I've never tried this with Xerces, but SAX parsers can hand your handler a SAX Locator; if you store it, you can get the line and column numbers from it as the document is parsed (or after an exception).
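For the plain SAX case, a minimal sketch might look like the following. It uses only the JDK's SAX classes; the class name LocatingHandler, the sample documents, and the "forbidden element" rule are made up for illustration. The handler keeps the Locator it receives in setDocumentLocator and uses it both to read the current line and to build a SAXParseException that carries the position.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that remembers the Locator supplied by the parser.
class LocatingHandler extends DefaultHandler {
    private Locator locator;                      // may stay null if the parser supplies none
    final List<String> seen = new ArrayList<>();  // "elementName@line" for each start tag

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        int line = (locator != null) ? locator.getLineNumber() : -1;
        seen.add(qName + "@" + line);
        // Made-up application-level rule, standing in for a custom validity check.
        if ("forbidden".equals(qName)) {
            // This constructor copies line and column from the locator.
            throw new SAXParseException("element <forbidden> is not allowed here", locator);
        }
    }

    static LocatingHandler parse(String xml) throws Exception {
        LocatingHandler handler = new LocatingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        try {
            parse("<config>\n  <item/>\n  <forbidden/>\n</config>");
        } catch (SAXParseException e) {
            System.out.println("line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
        }
    }
}
```

The position reported for a start tag is where that tag ends, so an element whose attributes span several lines is reported at its last line.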

It looks like XMLDocumentParser may be able to do the same thing. Its parent class, AbstractXMLDocumentParser, has a startDocument method which is passed an XMLLocator parameter. If you override this method, you can save the XMLLocator and use its getLineNumber and getColumnNumber methods.

Jason Day
Overriding the startDocument method is the way to go and works perfectly in my case, thank you.
chburd