Can I include characters such as "ã" and "ê" in UTF-8 encoded XML, or must it be UTF-16 encoded?
views: 228
answers: 2
+1
Q:
Can I include characters such as "ã" and "ê" in UTF-8 encoded XML, or must it be UTF-16 encoded?
+6
A:
You can encode those characters in UTF-8.
The key is to keep the encoding named in the prolog (<?xml version="1.0" encoding="utf-8" ?>) consistent with the actual encoding of the file.
The whole point of UTF-8 is to be able to encode all the Unicode characters in a smaller footprint. According to the source of all wisdom, Wikipedia, UTF-8 encodes each code point in 1 to 4 bytes and is backward compatible with ASCII.
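To illustrate the consistency point, here is a minimal Python sketch, assuming the standard-library `xml.etree.ElementTree` parser. The element name `word` and the variable names are made up for the example. It parses the same document twice: once with bytes that match the declared encoding, and once with Latin-1 bytes that contradict it.

```python
import xml.etree.ElementTree as ET

text = "ãê"
# Hypothetical one-element document; the declaration claims UTF-8.
doc = f'<?xml version="1.0" encoding="utf-8"?><word>{text}</word>'

# Case 1: the bytes really are UTF-8, so declaration and content agree.
good_bytes = doc.encode("utf-8")
parsed_text = ET.fromstring(good_bytes).text

# Case 2: the bytes are Latin-1 but the declaration still says UTF-8.
# The byte sequence is not valid UTF-8, so the parser should reject it.
bad_bytes = doc.encode("latin-1")
try:
    ET.fromstring(bad_bytes)
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True

print(parsed_text, mismatch_rejected)
```

Other parsers may report the mismatch differently, but a conforming XML processor is expected to treat it as an error rather than guess.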
John Weldon
2010-02-05 18:49:40
The "smaller footprint" doesn't apply to all characters: code points U+0800 to U+FFFF take 3 bytes in UTF-8 but only 2 in UTF-16.
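A quick way to check this is to compare encoded lengths in Python. This is only a sketch: the sample characters are my own choice, and `utf-16-le` is used so that no byte order mark is counted.

```python
# Sample characters: ASCII, the two from the question, and two above U+0800.
samples = ["a", "ã", "ê", "€", "中"]

# Map each character to (UTF-8 byte count, UTF-16 byte count).
sizes = {
    ch: (len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))
    for ch in samples
}

for ch, (u8, u16) in sizes.items():
    print(f"U+{ord(ch):04X} {ch}: UTF-8 {u8} bytes, UTF-16 {u16} bytes")
```

For "ã" and "ê" the two encodings tie at 2 bytes each; UTF-8 only wins for ASCII and only loses from U+0800 upward.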
dan04
2010-06-12 00:02:02
+5
A:
All Unicode Transformation Format encodings can encode any character found in Unicode. The characters given are found in the Unicode standard.
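As a small sanity check of that claim, the following Python sketch round-trips the two characters from the question through three UTF encodings. The codec names are Python's standard ones; the variable names are just for illustration.

```python
text = "ãê"

# Encode and decode with each UTF codec; a lossless round trip
# means the encoding can represent both characters.
roundtrips = {
    enc: text.encode(enc).decode(enc) == text
    for enc in ("utf-8", "utf-16", "utf-32")
}

# Both characters have code points assigned in Unicode.
code_points = [f"U+{ord(ch):04X}" for ch in text]

print(roundtrips, code_points)
```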
Ignacio Vazquez-Abrams
2010-02-05 18:51:34