Preface: I'm working on a docx parser for Java. The docx format is based on XML. When I read a document, its parts are unmarshalled with JAXB, and I get a tree of elements based on the XML markup.

Almost the problem: Some elements (at a very deep XML level) are returned not as a specific class from the docx spec (e.g. CTStyle, CTDrawing, CTInline) but as Object. Those objects are in fact instances of Xerces classes, e.g. ElementNSImpl.

Problem: How should I handle objects from xerces (e.g. ElementNSImpl)? The simplest approach is:

CTGraphicData gData = getGraphicData();
Object obj = gData.getAny().get(0);
ElementNSImpl element = (ElementNSImpl)obj;

But it doesn't seem to be a good solution. I've never worked with Xerces directly. What is a better way to do this cast? (If anyone could also give me a tip about the right way to iterate through the nodes, that would be great.)

+1  A: 

Because the XSD has an 'any' there, JAXB maps that piece of XML to the DOM. You should cast to 'Element' (the org.w3c.dom.Element interface), not 'ElementNSImpl', which is a Xerces implementation class. Then you have to use the DOM API, possibly with the assistance of XPath, to pull out the data.

If JAXB is giving you DOM elements and you think that the schema has a specific type there, not xs:any, then something's wrong with how you are configuring JAXB.
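A minimal sketch of what that could look like, assuming the object coming out of getAny() is a DOM node. The class name AnyElementDemo, the helper method names, and the sample XML fragment are made up for illustration; in the real parser the Object would come from gData.getAny().get(0) instead of from parsing a string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; none of these names come from the docx libraries.
class AnyElementDemo {

    // Walk the direct children with the plain DOM API and collect the
    // local names of the child elements (text and comment nodes are skipped).
    static List<String> childElementNames(Element parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getLocalName());
            }
        }
        return names;
    }

    // Evaluate an XPath expression relative to the given element and
    // return the string value of the first match ("" if nothing matches).
    static String stringByXPath(Element context, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (String) xpath.evaluate(expression, context, XPathConstants.STRING);
    }

    // Helper for the demo only: build a namespace-aware DOM element from a string.
    static Element parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for getLocalName() to work
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        // Simplified stand-in for what gData.getAny().get(0) might hold.
        String xml =
            "<pic:pic xmlns:pic='http://schemas.openxmlformats.org/drawingml/2006/picture'"
          + " xmlns:a='http://schemas.openxmlformats.org/drawingml/2006/main'"
          + " xmlns:r='http://schemas.openxmlformats.org/officeDocument/2006/relationships'>"
          + "<pic:nvPicPr/><pic:blipFill><a:blip r:embed='rId4'/></pic:blipFill><pic:spPr/>"
          + "</pic:pic>";
        Object obj = parse(xml);

        // Check against the DOM interface, not the Xerces implementation class.
        if (obj instanceof Element) {
            Element element = (Element) obj;
            System.out.println(childElementNames(element)); // [nvPicPr, blipFill, spPr]
            // local-name() avoids having to register a NamespaceContext.
            System.out.println(stringByXPath(element,
                ".//*[local-name()='blip']/@*[local-name()='embed']")); // rId4
        }
    }
}
```

The instanceof check keeps the code independent of which DOM implementation JAXB happens to use. Matching on local-name() is a shortcut; registering a NamespaceContext on the XPath object would be the stricter choice if the same local name can appear in several namespaces.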

xs:any in an XSD means 'anything'.

The <any> element enables us to extend the XML document with elements not specified by the schema.

bmargulies
No, `any` is normal. It's `any` in the spec.
Roman
Then you are in the DOM business.
bmargulies
Thanks for the answer, it's at least a good starting point. BTW, about `any`: do you think I cannot 'hardcode' the path and must use XPath? What I mean is: does 'any' mean that there can really be anything?
Roman