I had a brief look at `org.json.XML.toJSONObject(String)`, and it doesn't appear to be doing any character transcoding.
I suspect that the problem is in how your application reads the String that is then passed to `toJSONObject`. I suspect it is using the wrong character set.
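To illustrate what "using the wrong character set" looks like, here is a minimal sketch. The class and method names (`CharsetMismatchDemo`, `decodeAs`) and the sample XML are made up for illustration; they are not taken from your application. It decodes the same bytes twice, once with the charset they were written in and once with a different one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how decoding bytes with the wrong
// charset corrupts non-ASCII characters before any XML parsing happens.
class CharsetMismatchDemo {

    // Decodes raw bytes into a String using the given charset.
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Sample document (an assumption), written out as UTF-8 bytes.
        byte[] utf8Bytes = "<name>caf\u00e9</name>".getBytes(StandardCharsets.UTF_8);

        // Correct: the 'é' survives.
        System.out.println(decodeAs(utf8Bytes, StandardCharsets.UTF_8));

        // Wrong: the two UTF-8 bytes of 'é' are read as two separate
        // ISO-8859-1 characters.
        System.out.println(decodeAs(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

If the String is already damaged like this when it reaches `toJSONObject`, the JSON output will carry the same damage.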
There are actually two possibilities:
1. The XML has no 'encoding' attribute and your application is just choosing the wrong one.
2. The XML does have an 'encoding' attribute, but your application is unable to respect it.
The second possibility is problematic. In an ideal world, an XML document would be parsed by reading its bytes as ASCII until the 'encoding' attribute in the `<?xml ... ?>` declaration has been read, and then switching character interpretation to the document's specified encoding. But the XML parser used by `org.json` is not capable of doing this, and its API doesn't allow it anyway. So if you have XML with an 'encoding' attribute, you'll have to detect it (by some means) before you turn the document into a Java String.
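One possible way to do that detection, as a rough sketch rather than a complete solution: look at the first few bytes as ISO-8859-1 (which maps every byte to a character), search for the 'encoding' attribute with a regular expression, and then decode the whole document with the charset it names. The class `XmlEncodingSniffer`, its method names, the 200-byte window, and the UTF-8 fallback are all assumptions of mine. It only works for ASCII-compatible encodings and ignores byte order marks, so UTF-16 documents would need extra handling.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: picks a charset from the XML declaration
// before the bytes are turned into a Java String.
class XmlEncodingSniffer {

    // Matches the encoding attribute of a declaration at the very start
    // of the document, e.g. <?xml version="1.0" encoding="ISO-8859-1"?>
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml[^>]*?\\sencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    // Returns the declared charset, or the fallback if there is no
    // declaration or the declared name is not known to this JVM.
    static Charset detectCharset(byte[] xmlBytes, Charset fallback) {
        // Only the start of the document can hold the declaration.
        int headLength = Math.min(xmlBytes.length, 200);
        String head = new String(xmlBytes, 0, headLength, StandardCharsets.ISO_8859_1);

        Matcher matcher = ENCODING.matcher(head);
        if (matcher.find()) {
            try {
                return Charset.forName(matcher.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: use the fallback.
            }
        }
        return fallback;
    }

    // Decodes the document using the declared charset, assuming UTF-8
    // when nothing usable is declared.
    static String decode(byte[] xmlBytes) {
        Charset charset = detectCharset(xmlBytes, StandardCharsets.UTF_8);
        return new String(xmlBytes, charset);
    }
}
```

The String returned by `decode` is what you would then pass to `toJSONObject`. This requires that your application gets hold of the raw bytes (for example from a file or an HTTP response body) rather than a String that something else has already decoded.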