views:

588

answers:

5

Hello everyone,

I created the following file in Visual Studio 2008 as a new XML file, and it reports the error below. What does the error message mean, and why is the file treated as malformed XML?

Here is the XML file and related error message.

<?xml version="1.0" encoding="utf-8"?>
<Foo>&#x2;</Foo>

Error   1 Character ' ', hexadecimal value 0x2 is illegal in XML documents. XMLFile1.xml 2 6 Miscellaneous Files

Thanks in advance, George

+2  A: 

0x2 is not a printable character.

dommer
@dommer, so all the other characters in my sample are valid, except &#x2;?
George2
Yes. e would be OK, for example.
dommer
Do you have any documentation for this rule -- that an XML file may contain only printable characters? I am interested to learn more.
George2
+7  A: 

Your problem is the reference &#x2;, which is essentially a piece of binary data (the control character U+0002) that cannot be printed. This is not allowed in XML 1.0 (it is allowed in XML 1.1, but it's not certain that your .NET version will accept it even if you change the XML version).

Banang
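A minimal sketch of the same failure outside Visual Studio, assuming Python's standard-library parser (xml.etree.ElementTree, which only implements XML 1.0) behaves like the .NET one on this point. The helper name try_parse is made up for illustration.

```python
import xml.etree.ElementTree as ET


def try_parse(xml_bytes):
    """Return (True, element text) if the document parses, else (False, error message)."""
    try:
        root = ET.fromstring(xml_bytes)
        return True, root.text
    except ET.ParseError as exc:
        return False, str(exc)


# The document from the question: a character reference to U+0002.
bad = b'<?xml version="1.0" encoding="utf-8"?>\n<Foo>&#x2;</Foo>'
# Same document, but the reference points at U+0065 (the letter e).
good = b'<?xml version="1.0" encoding="utf-8"?>\n<Foo>&#x65;</Foo>'

bad_ok, bad_message = try_parse(bad)
good_ok, good_text = try_parse(good)

print(bad_ok, bad_message)  # rejected; the parser's message names the problem
print(good_ok, good_text)   # accepted
```

The first document is rejected as not well-formed, while the second one parses and yields the letter e as the element text.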
@Banang, 1. "binary reference" -- I am interested to learn what a "binary reference" is; any recommended reading? 2. Can XML 1.0 contain only printable characters? Do you have any documentation for this rule? I am interested to learn more.
George2
@George: "binary reference" was a typo, which was fixed in an edit. I meant to say it was a reference to binary data. ;) I would suggest looking at http://www.xml.com/pub/a/2003/02/26/binaryxml.html For a historical read on XML and binary data, I would suggest: http://www.xml.com/pub/a/98/07/binary/binary.html
Banang
Compare XML 1.0 http://www.w3.org/TR/REC-xml/#charsets with XML 1.1 http://www.w3.org/TR/xml11/#charsets Note that Char is defined slightly differently.
Joachim Sauer
looks like #x2 is not in the range?
George2
George2: &#x2; represents U+0002 in XML and is outside the range of allowed characters in XML 1.0. It is allowed in XML 1.1.
Joachim Sauer
@Banang, your document describes how to represent binary data in XML. But in my sample, why is my input treated as binary data? What rule does an XML parser use to decide that something is binary data? Thanks.
George2
+2  A: 

If you need to put binary data inside XML, use the CDATA section. http://www.w3schools.com/XML/xml_cdata.asp

daanish.rumani
Do you have any documentation for this rule -- that an XML file may contain only printable characters? I am interested to learn more.
George2
There is no rule as such, but certainly if you want to view the file in Internet Explorer or the like, you need to have characters that can be printed. Remember that there are certain characters like the angle brackets('<' and '>') that have special meaning in an XML file.
daanish.rumani
"Remember that there are certain characters like the angle brackets('<' and '>') that have special meaning in an XML file." -- is it related to my question?
George2
Suppose that your binary data you want to include contains angle brackets. Now you would have to escape each and every angle bracket with &gt; or &lt;. The other way is to encapsulate the binary data into a CDATA section and keep the XML parser at bay.
daanish.rumani
"Suppose that your binary data you want to include contains angle brackets" -- confused. From where do you have this conclusion? :-)
George2
Character 0x02 is not valid, even in a CDATASection. CDATA sections are only a hand-authoring convenience; they don't allow any greater range of characters.
bobince
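A small sketch of the point about CDATA, again assuming Python's standard-library XML 1.0 parser as a stand-in for the .NET one. It also shows Base64 as one common way to carry arbitrary bytes as ordinary text; Base64 is a substitute technique brought in here for illustration, not something proposed in this thread. The names parses, raw, cdata_doc and b64_doc are made up.

```python
import base64
import xml.etree.ElementTree as ET


def parses(xml_bytes):
    """Return True if the bytes form a well-formed XML 1.0 document."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# Hypothetical payload: a control character, angle brackets and a NUL byte.
raw = b"\x02<binary>\x00"

# A literal 0x02 inside CDATA is still outside the Char production.
cdata_doc = b"<Foo><![CDATA[\x02]]></Foo>"

# Base64 output uses only letters, digits, '+', '/' and '=', all legal in XML text.
b64_doc = b"<Foo>" + base64.b64encode(raw) + b"</Foo>"

cdata_ok = parses(cdata_doc)
b64_ok = parses(b64_doc)
round_trip = base64.b64decode(ET.fromstring(b64_doc).text)

print(cdata_ok)           # the CDATA document is rejected
print(b64_ok)             # the Base64 document is accepted
print(round_trip == raw)  # the original bytes come back unchanged
```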
@bobince, "Character 0x02 is not valid, even in a CDATASection." -- do you have any documentation to prove it? I thought CDATA means unparsed data, so it could contain anything. :-)
George2
+2  A: 

Check out the XML 1.0 specification

In particular, see the definition of Characters in section 2.2:

Char ::= #x9 | 
         #xA |
         #xD |
         [#x20-#xD7FF] |
         [#xE000-#xFFFD] |
         [#x10000-#x10FFFF]

And the definition of character references in section 4.1:

Characters referred to using character references must match the production for Char.

toolkit
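The Char production quoted above can be turned into a range check directly. This is only a sketch of the XML 1.0 rule; the function names is_xml10_char and find_illegal_chars are made up for illustration.

```python
def is_xml10_char(code_point):
    """Return True if the code point matches the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)           # tab, line feed, carriage return
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def find_illegal_chars(text):
    """Return (index, hex code point) pairs for characters XML 1.0 does not allow."""
    return [
        (index, hex(ord(ch)))
        for index, ch in enumerate(text)
        if not is_xml10_char(ord(ch))
    ]


print(is_xml10_char(0x2))               # False: the character from the question
print(is_xml10_char(0x65))              # True: the letter e
print(find_illegal_chars("ab\x02c\t"))  # [(2, '0x2')]
```

A check like this can be run over text before it is written into an XML 1.0 document, so that illegal characters are caught before a parser complains.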
Thanks toolkit, looks like #x2 is not in the range?
George2
+4  A: 

I created the following file in Visual Studio 2008 as a new XML file, and it reports the error below. What does the error message mean, and why is the file treated as malformed XML?

According to the W3C XML 1.0 Specification, the only characters allowed in an XML document that are below &#x20; are the tab (09), newline (0A) and carriage return (0D).

XML 1.1 allows almost all characters, excluding 00 (the control characters it adds may appear only as character references, not literally), but it is very rarely implemented and one should not rely on finding an XML 1.1 implementation.

Even the XML 1.1 specification says that the use of the newly allowed characters below &#x20; "is strongly discouraged".

Dimitre Novatchev
@George2 One can write any permitted Unicode character as a "character reference", that is, in the form &#{number}; (decimal) or &#x{number}; (hexadecimal). Without the starting ampersand this will be just some string and will not be interpreted by the XML parser as a character reference.
Dimitre Novatchev
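A short illustration of the character reference forms, assuming Python's standard-library parser; the variable names are made up. Decimal and hexadecimal references to the same code point resolve to the same character, and the same text without the leading ampersand stays a plain string.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references to U+0065 (the letter e).
decimal_ref = ET.fromstring("<Foo>&#101;</Foo>").text
hex_ref = ET.fromstring("<Foo>&#x65;</Foo>").text

# Without the leading ampersand the parser sees ordinary text.
no_ampersand = ET.fromstring("<Foo>#101;</Foo>").text

print(decimal_ref)   # e
print(hex_ref)       # e
print(no_ampersand)  # #101;
```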