I am using XOM to canonicalize some XML. But there are some strange characters prepended to the output. The core of the code is as follows:

String result;
outputstream = new ObjectOutputStream(bytestream);
Builder builder = new Builder();
Canonicalizer canonicalizer = new Canonicalizer(outputstream, Canonicalizer.EXCLUSIVE_XML_CANONICALIZATION);
nu.xom.Document input = builder.build(xml, uri);
Node node = input.getRootElement();
String xpath = "//a:head";
XPathContext context = new XPathContext("a", "http://example.com/a");
Nodes nodes = node.query(xpath, context);
if (nodes.size() > 0) {
    canonicalizer.write(nodes.get(0));
    outputstream.close();
    result = bytestream.toString("UTF8");
}

The input, xml, contained:

<a:envelope   xmlns:b='http://example.com/b'   xmlns:a="http://example.com/a">
  <a:document>
    <a:head>
      <b:this>this</b:this>
      <b:that>that</b:that>
      <b:what />
    </a:head>
    <a:body>
    </a:body>
  </a:document>
</a:envelope>

When the result is displayed in a JTextArea, six unexpected characters appear before the first <. The decimal values of the first seven bytes in the bytestream are -84, -19, 0, 5, 119, -36, 60. The 60 is the first <, and the canonical XML follows from there.

What am I doing wrong?
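
One observation that may narrow this down: the first four bytes, -84, -19, 0, 5, are 0xAC 0xED 0x00 0x05, which matches the Java serialization stream header that ObjectOutputStream writes when it is constructed. The next byte, 119 (0x77), looks like a block-data marker followed by a one-byte length. The sketch below is a minimal reproduction using only the JDK, with no XOM involved. The class name StreamHeaderDemo and its method names are made up for illustration, and the payload is a stand-in for the canonical XML.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StreamHeaderDemo {

    // Hypothetical helper: writes the payload through an ObjectOutputStream
    // wrapped around a ByteArrayOutputStream, as in the question's code.
    static byte[] viaObjectOutputStream(byte[] payload) {
        try {
            ByteArrayOutputStream bytestream = new ByteArrayOutputStream();
            ObjectOutputStream outputstream = new ObjectOutputStream(bytestream);
            outputstream.write(payload);
            outputstream.close();
            return bytestream.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes the same payload straight to the
    // ByteArrayOutputStream, with no wrapper stream in between.
    static byte[] direct(byte[] payload) {
        ByteArrayOutputStream bytestream = new ByteArrayOutputStream();
        bytestream.write(payload, 0, payload.length);
        return bytestream.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-in for the canonicalized XML.
        byte[] payload = "<a:head></a:head>".getBytes(StandardCharsets.UTF_8);

        // The wrapped stream carries extra bytes before the payload.
        System.out.println(Arrays.toString(viaObjectOutputStream(payload)));

        // The direct stream contains the payload only.
        System.out.println(Arrays.toString(direct(payload)));
    }
}
```

If this is indeed the cause, handing the ByteArrayOutputStream directly to the Canonicalizer constructor (which accepts any OutputStream) instead of wrapping it in an ObjectOutputStream would presumably remove the prefix. I have not verified that against the XOM code above.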