I'm trying to run an XSLT transformation, but characters like ëöï are replaced by a literal '?' in the output (I checked with a hex editor). The source file has the correct characters, and the stylesheet has:

<xsl:output encoding="UTF-8" indent="yes" method="xml"/>

What else am I missing?

I'm using Saxon as the transformer, if that matters.

+2  A: 

The problem is most likely in the way you call the transformer. My guess is that it will work fine if you call it using `java -jar saxon.jar ...`

In general, XML tools that take an `InputStream`/`OutputStream` will make sure that the encoding is correct themselves.

When you use a mixture of Streams and Writers, you will have to make sure that the encoding when going from one to the other matches what you told the XSLT processor to produce. Always set encodings explicitly. There may be defaults, but when it comes to encodings, they are wrong more often than not.
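As a minimal sketch of those two options in Java: the first method hands the transformer an `OutputStream`, so the serializer applies the `xsl:output` encoding itself; the second uses a `Writer` and names the charset explicitly so that it matches the stylesheet. The class name, the identity stylesheet, and the in-memory output are illustrative assumptions, not taken from the question, and `TransformerFactory.newInstance()` returns whichever JAXP implementation is configured (Saxon only if it is on the classpath).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringReader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EncodingSafeTransform {
    // Hypothetical identity stylesheet that declares UTF-8 output, as in the question.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output encoding='UTF-8' method='xml'/>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static Transformer newTransformer() throws TransformerException {
        return TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
    }

    // Option 1: give the transformer a byte stream and let it do the encoding.
    static byte[] viaOutputStream(String xml) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Option 2: if a Writer is required, name the charset explicitly and
    // keep it identical to the encoding declared in xsl:output.
    static byte[] viaWriter(String xml) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
                newTransformer().transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(writer));
            }
            return out.toByteArray();
        } catch (TransformerException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00EB\u00F6\u00EF is the escaped form of the three characters from the question.
        String xml = "<a>\u00EB\u00F6\u00EF</a>";
        System.out.println(new String(viaOutputStream(xml), StandardCharsets.UTF_8));
        System.out.println(new String(viaWriter(xml), StandardCharsets.UTF_8));
    }
}
```

To write to disk instead, the `ByteArrayOutputStream` could be swapped for a `FileOutputStream`; the point in both variants is that no step relies on the platform default charset.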

Bart Schuller
What's the correct way? I'm currently doing `TransformerFactory.newInstance().newTransformer()` and `transformer.transform(source, result)`.
Sietse
You're right, the command line works as it should.
Sietse
What is `result` and how does it get written to disk?
Bart Schuller
`result = new StreamResult(new OutputStreamWriter(new FileOutputStream("/tmp/foo.xml")));`
Sietse
Every time you use an API which goes from bytes to chars or the other way around, you should look for an encoding parameter. Treat any default as wrong.
Bart Schuller
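A small Java sketch of why a missing encoding parameter can produce a literal '?': when text is encoded with a charset that cannot represent a character, `String.getBytes` substitutes the charset's replacement byte, which for US-ASCII is `?`. That the asker's platform default was such a charset is an assumption, not something the thread confirms; the class and method names here are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {
    // Encode text with an explicitly named charset and show the bytes as hex.
    static String hexOf(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Escaped form of the three characters from the question.
        String text = "\u00EB\u00F6\u00EF";

        // What an API without an encoding parameter would silently use.
        System.out.println("platform default: " + Charset.defaultCharset());

        // US-ASCII cannot represent these characters, so each becomes 3F ('?').
        System.out.println("US-ASCII: " + hexOf(text, StandardCharsets.US_ASCII));

        // UTF-8 encodes each of them as two bytes.
        System.out.println("UTF-8:    " + hexOf(text, StandardCharsets.UTF_8));
    }
}
```

Passing the charset explicitly, as `hexOf` does, makes the result independent of the machine the code runs on.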