I have what is probably a really simple, stupid question, but I can't find an answer to it anywhere and I need to be pretty sure about this.

I have various XML files from various vendors. One of the vendors provides me with an XML file that contains Japanese characters. Originally, I was having trouble processing the XML file (I'm using the MSXML SDK): the characters would come out wrong. I found that if the following declaration was added to the XML file, everything worked great.

<?xml version="1.0" encoding="UTF-16"?>

And so I asked the vendor to add this to their file. But they added it with the encoding in lower case:

<?xml version="1.0" encoding="utf-16"?>

And when I load this new file, with this declaration, I'm getting the same problem as when this declaration was not there.

What I'm trying to figure out (for sure) is if that encoding attribute is case sensitive (or is otherwise the problem). Does it matter that they put "utf-16" versus "UTF-16"?

Update: On the advice of those who posted answers here, I set up and ran a test. One file had the lower-case utf-16 and the other the upper-case UTF-16. Other than that, the files were identical. Changing the case did not fix the problem, so it is not the cause. My conclusion is that MSXML matches the encoding name case-insensitively, as the spec quoted in the answers recommends.
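For anyone who wants to repeat this kind of side-by-side test without setting up MSXML, here is a minimal sketch in Python. It uses the standard library's `xml.etree.ElementTree` (backed by expat), so it only shows how that parser behaves and says nothing definitive about MSXML. The helper names `build_document` and `parse_text` are hypothetical, and the sample text is an assumption, not the vendor's data.

```python
import xml.etree.ElementTree as ET

# Sample Japanese text written with escapes so the script is encoding-safe.
# This is illustrative data, not content from the vendor's file.
SAMPLE_TEXT = "\u65e5\u672c\u8a9e"


def build_document(encoding_name: str) -> bytes:
    """Build a small document whose bytes really are UTF-16 (with BOM).

    Only the spelling of the encoding name in the declaration varies.
    """
    text = (
        f'<?xml version="1.0" encoding="{encoding_name}"?>'
        f"<item>{SAMPLE_TEXT}</item>"
    )
    return text.encode("utf-16")


def parse_text(encoding_name: str) -> str:
    """Parse the document and return the text of the root element."""
    root = ET.fromstring(build_document(encoding_name))
    return root.text


upper_result = parse_text("UTF-16")
lower_result = parse_text("utf-16")

# If the parser matches encoding names case-insensitively,
# both spellings yield the same decoded text.
print("same result:", upper_result == lower_result)
```

Both documents are byte-for-byte identical apart from the case of the encoding name, which mirrors the two-file test described above.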

+3  A: 

From the XML specs:

XML processors SHOULD match character encoding names in a case-insensitive way

So case-insensitive matching is recommended but not required. RFC 2119 defines SHOULD as follows:

  3. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
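To illustrate what case-insensitive matching of encoding names means, here is a minimal Python sketch. The function `same_encoding_name` is a hypothetical helper written for this example and is not how MSXML or any particular parser implements the check. Python's own `codecs.lookup` is shown alongside it because it also resolves codec names regardless of case.

```python
import codecs


def same_encoding_name(declared: str, expected: str) -> bool:
    """Compare two encoding names while ignoring case (hypothetical helper)."""
    return declared.strip().upper() == expected.strip().upper()


# A processor that follows the SHOULD treats both spellings as the same name.
print(same_encoding_name("utf-16", "UTF-16"))

# Python's codec registry behaves the same way for its own lookups.
print(codecs.lookup("utf-16").name == codecs.lookup("UTF-16").name)
```

A processor that ignored the recommendation would do the equivalent of a plain `declared == expected` comparison, which is the behavior the question is worried about.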
schnaader
Though your answer and JoshJordan's are about the same, I'm going to give the 'answer' to him because he pointed out that it may not be true in practice and that we should try a side-by-side test. But I do appreciate your leaving an appropriate answer, and I'm going to +1 for that. Again, thank you.
Frank V
+2  A: 

I suppose the question is not really "is the standard case-sensitive?" but "is the encoding case-sensitive in the MSXML SDK?"

From bytes.com:

The XML spec says that processors "SHOULD" be match encoding names case-insensitively. "SHOULD" is a technical term, less strong than "MUST", but I can't see any reason why a processor would not do it.

However, we know that that may not always be true in practice. If you can try both side-by-side, please do so and let us know what the result is.

JoshJordan
I can and will. It just takes quite a bit of effort to set it up. I was hoping someone had detailed knowledge of MSXML...
Frank V
I posted an update above. Thank you.
Frank V