I have this code:

 Dim doc As XDocument = New XDocument( _
  New XDeclaration("1.0", "utf-8", "yes"), _
   New XElement("transaction", _
    New XElement("realm", wcRealm), _
    New XElement("password", wcPassword), _
    New XElement("confirmation_email", wcConfEmail), _
    New XElement("force_subscribe", wcSubscribe), _
    New XElement("optout", wcOptOut), _
    New XElement("command", _
     New XElement("type", wcType), _
     New XElement("list_id", wcListId), _
     From trans As DataRow In table.Rows _
     Order By trans("last") _
     Select New XElement("record", _
       New XElement("email", trans("email")), _
       New XElement("first", trans("first")), _
       New XElement("last", trans("last")), _
       New XElement("company", trans("company")), _
       New XElement("address_1", trans("address_1")), _
       New XElement("address_2", ""), _
       New XElement("city", trans("city")), _
       New XElement("state", trans("state")), _
       New XElement("zip", trans("zip")), _
       New XElement("country", trans("country")), _
       New XElement("phone", trans("phone")), _
       New XElement("fax", trans("fax")), _
       New XElement("custom_source", trans("source")), _
       New XElement("custom_vmail_expire_date", "")))))
 ' Save the XML document to the root of C:.
 doc.Save("c:\vj" & saveDate & ".xml")

which works fine and produces the proper XML file, but when I run it through a validator I get this error:

Sorry, I am unable to validate this document because on line 1 it contained one or more bytes that I cannot interpret as us-ascii (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.

The error was: ascii "\xEF" does not map to Unicode

What could be causing that?

+2  A: 

The problem is that you have a UTF-8 file that you are trying to validate as ASCII. The bytes at the start of the file (EF BB BF) are the UTF-8 byte order mark, and the validator is choking on the first of them.
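
To confirm that this is what the validator is tripping over, you can look at the first three bytes of the saved file. This is only a rough sketch; the module name `BomCheck` and the function `HasUtf8Bom` are made up for illustration and are not part of the original code.

```vb
Imports System.IO

Module BomCheck
    ' Hypothetical helper: returns True when the file starts with
    ' the UTF-8 byte order mark (bytes EF BB BF).
    Function HasUtf8Bom(ByVal fileName As String) As Boolean
        Dim head(2) As Byte ' three elements: indexes 0 to 2
        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            ' A file shorter than three bytes cannot contain the mark.
            If fs.Read(head, 0, 3) < 3 Then Return False
        End Using
        Return head(0) = &HEF AndAlso head(1) = &HBB AndAlso head(2) = &HBF
    End Function
End Module
```

If `HasUtf8Bom("c:\vj" & saveDate & ".xml")` returns True, the `\xEF` in the error message is the first byte of that mark and not part of your data.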

David
A: 

The validator doesn't support UTF-8. Either save the file as ASCII (which will break, as the XML declaration says it's utf-8) or find a validator that was created within the last 5 years.

EDIT:

Note: If you want to save it as US-ASCII, use New XDeclaration("1.0", "us-ascii", "yes")
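
A small sketch of that change, assuming `XDocument.Save` picks its output encoding from the declaration (which is how it behaves as far as I know). The element names, the sample value, and the file name `c:\vj-ascii.xml` are placeholders, not taken from the question.

```vb
Imports System.Xml.Linq

Module AsciiSave
    Sub Main()
        ' The declaration now says us-ascii, so Save should write
        ' plain ASCII bytes with no byte order mark.
        Dim doc As New XDocument( _
            New XDeclaration("1.0", "us-ascii", "yes"), _
            New XElement("transaction", _
                New XElement("first", "Zo" & ChrW(&HEB))))
        ' Placeholder path for illustration only.
        doc.Save("c:\vj-ascii.xml")
    End Sub
End Module
```

Characters that do not exist in ASCII (the e with diaeresis above) should come out as numeric character references, so the data survives, but check the output if your records contain a lot of non-English text.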

David Kemp
Is there any workaround for this for legacy systems?
Brett
Yeah - use XDeclaration("1.0", "us-ascii", "yes")
David Kemp
A: 

The file is saved as UTF-8 with the byte order mark (BOM) character at the start (this character begins with the octet 0xEF).

Your validator, for some reason, does not seem to like this character. Strictly speaking this character is whitespace, and it is invalid to have whitespace preceding the XML declaration. However, most parsers I know will skip it, treating it simply as an indicator of Unicode encoding rather than as content.
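
If you would rather keep UTF-8 and just drop the mark, one option is to save through an `XmlWriter` whose encoding is a `UTF8Encoding` created with the BOM switched off. This is a minimal sketch; `NoBomSave` and `SaveWithoutBom` are hypothetical names, and I am assuming the rest of your code stays as posted.

```vb
Imports System.Text
Imports System.Xml
Imports System.Xml.Linq

Module NoBomSave
    ' Hypothetical helper: writes doc as UTF-8 without the
    ' leading EF BB BF bytes.
    Sub SaveWithoutBom(ByVal doc As XDocument, ByVal fileName As String)
        Dim settings As New XmlWriterSettings()
        ' False = do not emit a byte order mark.
        settings.Encoding = New UTF8Encoding(False)
        settings.Indent = True
        Using writer As XmlWriter = XmlWriter.Create(fileName, settings)
            doc.Save(writer)
        End Using
    End Sub
End Module
```

You would then call `SaveWithoutBom(doc, "c:\vj" & saveDate & ".xml")` in place of `doc.Save(...)`. The declaration still says utf-8, so parsers that handle UTF-8 are unaffected.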

AnthonyWJones
The official W3C standard allows a BOM to appear before the XML declaration. Any validator that cannot handle this is non-standard.
Christian Hayter