ansaurus

Question

Using Android SAXParser, one of my XML elements is mysteriously breaking in half.

Answer 1

+1 A:

It is legitimate for the characters method to fire multiple times between startElement and endElement in a SAXParser. If your implementation isn't handling it, most likely the ContentHandler being used has an incorrectly coded characters method.

From the code snippet, I think the misbehaving characters method is elsewhere in your code, as you're passing 'this' as ContentHandler. Post that code, and maybe we can help fix it.

See the Javadoc, noting the phrase

SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks

This Javadoc is for ContentHandler. It appears you're using DocumentHandler, which has been deprecated in favor of ContentHandler. But the javadoc for DocumentHandler contains identical language.

Don Roby 2010-03-30 23:39:18

Thanks donroby. Considering that the code only produces poor results when the StringReader and InputSource objects are used, it appears to me that the problem lies there. When I bypass this implementation, it processes correctly, albeit unsatisfactorily for production. Consider also that regardless of the sort order used on the XML data, the problem occurs 2001 characters into the XML. Thanks!

FauxReal 2010-03-31 00:40:52

When you implement things incorrectly, sometimes they work in spite of your error. The problem lies in your code regardless of the fact that it sometimes seems to work.

Don Roby 2010-03-31 13:24:19

Answer 2

+3 A:

As donroby said, it's perfectly legitimate for the parser to call the characters method more than once between startElement and endElement. However, that isn't "misbehaving" at all, and you shouldn't try to finagle things so that it doesn't happen. Your parser seems to be using a 2000-character buffer, but there are other reasons it might break a text node into parts.

What you should do is to accumulate data in the characters method and process it later, in the endElement method when you are sure you have accumulated all of the character data for the node.
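A minimal sketch of that pattern follows. It is not the asker's code: the class name AccumulatingHandler, the element name "item", and the parseItems helper are all made up for illustration. It assumes simple elements that contain only text (no mixed content) and the default, non-namespace-aware parser, so the element name arrives in qName.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <item> element.
class AccumulatingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // Start each element with an empty accumulator.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per text node, so only append here.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // All chunks for this element have arrived; now it is safe to use them.
        if ("item".equals(qName)) {
            values.add(text.toString());
        }
        text.setLength(0);
    }

    // Hypothetical helper: parses an XML string and returns the <item> texts.
    static List<String> parseItems(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            AccumulatingHandler handler = new AccumulatingHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item>alpha</item><item>beta</item></items>";
        System.out.println(parseItems(xml)); // prints [alpha, beta]
    }
}
```

Because the handler only appends in characters and only reads the buffer in endElement, it should give the same result whether the parser delivers a text node in one call or in several.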

Paul Clapham 2010-03-31 04:12:11

+1. Yes, the usual handling is to create or attach an accumulator of some sort in the startElement method, accumulate into it in the characters method, and then to use and dispose or detach it in the endElement method.

Don Roby 2010-03-31 13:27:32

Answer 3

A:

Thank you both so much for your responses. With your help I was able to solve the problem.

I was doing the actual processing inside the "characters" method, which is what I learned from an online tutorial.

By moving the processing to the endElement method, I was able to simply concatenate chars together into a string regardless of how many times 'characters' fired.

I accomplished this rather simply by setting up a boolean betweenTags and turning this true during startElement and false at the end of endElement.

Inside characters, I've added

if (betweenTags) accumulation += chars;

The accumulation string is set to "" at the end of startElement.
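For anyone reading later, here is a rough sketch of the approach described above. Only betweenTags and accumulation come from this post; the class name, the lastValue field, and the simulateSplit helper are invented for illustration. It assumes chars is a String built from the callback arguments, since appending the raw char[] to a String would append the array's object identifier rather than its contents. The helper calls the handler methods by hand to imitate a parser that splits one text node into two chunks.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler modeled on the betweenTags / accumulation description.
class BetweenTagsHandler extends DefaultHandler {
    private boolean betweenTags = false;
    private String accumulation = "";
    private String lastValue = null; // assumed place to keep the finished text

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        betweenTags = true;
        accumulation = ""; // reset at the end of startElement
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (betweenTags) {
            // Convert only the reported slice of the buffer to a String.
            accumulation += new String(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        lastValue = accumulation; // do the real processing here
        betweenTags = false;      // turned off at the end of endElement
    }

    String lastValue() {
        return lastValue;
    }

    // Hypothetical helper: feeds one element's text to the handler in two chunks.
    static String simulateSplit(String first, String second) {
        BetweenTagsHandler h = new BetweenTagsHandler();
        h.startElement("", "", "name", new AttributesImpl());
        h.characters(first.toCharArray(), 0, first.length());
        h.characters(second.toCharArray(), 0, second.length());
        h.endElement("", "", "name");
        return h.lastValue();
    }

    public static void main(String[] args) {
        System.out.println(simulateSplit("Hel", "lo")); // prints Hello
    }
}
```

For long text nodes a StringBuilder is probably a better accumulator than repeated String concatenation, but the String version stays closer to what is described here.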

Works great now, no broken elements.

THANKS!

FauxReal 2010-03-31 16:12:07

You're welcome! If you now accept an answer it'll improve someone's reputation and your acceptance ratio.

Don Roby 2010-03-31 19:57:59

oh! Okay thanks!

FauxReal 2010-04-01 21:45:56
