ansaurus

Question

handling carriage return in canonicalization with java

Answer 1

A:

The XML specification allows the input to use different EOL styles (such as CR LF or a lone CR), but requires the parser to replace each of them with a single linefeed (\n, ASCII 10) character before the document is parsed.
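To make the normalization concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name EolNormalizationDemo and the helper rootText are made up for this illustration. It compares a literal carriage return in the input with one written as a character reference.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of any library.
class EolNormalizationDemo {

    // Parses the XML text and returns the text content of the root element.
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Literal CR LF and a lone CR are both turned into LF by the parser.
        String literal = rootText("<a>x\r\ny\rz</a>");
        System.out.println(literal.equals("x\ny\nz"));

        // A character reference is resolved after line-end normalization,
        // so the carriage return survives in the text node.
        String escaped = rootText("<a>x&#13;\ny</a>");
        System.out.println(escaped.equals("x\r\ny"));
    }
}
```

With a conforming parser, both comparisons should come out true: the literal carriage returns are gone by the time the text node exists, while the one written as &#13; is kept.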

If you want to protect the character, you must replace ASCII 13 with &#13; yourself before the XML parser sees the input. If you use Java, I suggest using a FilterInputStream.
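A rough sketch of such a filter follows. The class name CrEscapingInputStream and the helper escapeCr are invented for this example. It assumes an ASCII-compatible encoding such as UTF-8 (it will not work for UTF-16), and it assumes carriage returns only occur in text content or attribute values: a CR inside a tag, the prolog, a comment or a CDATA section would be rewritten too, which is usually not what you want.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical filter: replaces every byte 13 (CR) with the bytes of "&#13;".
// Assumes an ASCII-compatible encoding and CRs only in text or attribute values.
class CrEscapingInputStream extends FilterInputStream {
    private static final byte[] ESCAPED_CR = "&#13;".getBytes(StandardCharsets.US_ASCII);

    // Index of the next escape byte to emit; ESCAPED_CR.length means none pending.
    private int pending = ESCAPED_CR.length;

    CrEscapingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (pending < ESCAPED_CR.length) {
            return ESCAPED_CR[pending++];   // finish an escape already started
        }
        int b = super.read();
        if (b == '\r') {
            pending = 1;                    // '&' is returned now, the rest later
            return ESCAPED_CR[0];
        }
        return b;                           // any other byte, or -1 at end of stream
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int b = read();
            if (b < 0) {
                break;
            }
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public long skip(long n) throws IOException {
        // Skip through read() so that a pending escape is not bypassed.
        long skipped = 0;
        while (skipped < n && read() >= 0) {
            skipped++;
        }
        return skipped;
    }

    @Override
    public boolean markSupported() {
        return false;                       // mark/reset would lose the pending state
    }

    // Convenience helper for trying the filter out on a string.
    static String escapeCr(String xml) throws IOException {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new CrEscapingInputStream(new ByteArrayInputStream(bytes))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) >= 0) {
                out.write(b);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

You would wrap the original stream before handing it to the parser, for example builder.parse(new CrEscapingInputStream(rawInput)), where builder is your DocumentBuilder and rawInput your input stream. The parser then resolves each &#13; to a real carriage return in the text node instead of normalizing it away.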

Aaron Digulla 2010-08-10 13:10:07

Does that mean it is wrong to expect the CR to be replaced during canonicalization in this case?

artsince 2010-08-10 13:18:51

Not only in this case; XML always swallows it even before the text nodes are created.

Aaron Digulla 2010-08-10 13:24:37

I'm afraid I misguided artsince earlier about the way c14n is meant to preserve explicit &#xD; references but normalise literal U+000D characters, because this had made a problem with the equivalent .NET code appear correct to me when it was not. In .NET one wants to do the normalisation before loading the XmlDocument so that the correct cases of &#xD; are preserved, as otherwise they can't be distinguished from the explicit cases. Easily done, but the fact that &#xD; is often correct in c14n output had misled me.

Jon Hanna 2010-08-10 14:49:24
