I'm trying to load an XML file into a VB.NET XmlDocument, but I'm getting the error:

'.', hexadecimal value 0x00, is an invalid character. Line 94, position 1.

I'm wondering if there is a way to strip or replace the 0x00 character.
The following are lines 92, 93 and 94; the file ends on line 94:

92 |    </request>
93 |</rpc> <!-- XML Finishes here -->
94 |

Thanks for any help.

EDIT: adding code used to get the file.

Dim fs As FileStream = File.Open(FileName, FileMode.Open, FileAccess.Read)
Dim buffer(fs.Length) As Byte
fs.Read(buffer, 0, fs.Length)
Dim xmlString As String = System.Text.UTF8Encoding.UTF8.GetString(buffer)
fs.close()

Doc.LoadXml(xmlString.Trim)

I am using System.Text.UTF8Encoding.UTF8.GetString(buffer) because the file encoding is not always UTF-8. Unfortunately, I don't have control over the XML file: we receive it from an external source who won't change the way the file is generated, because it is used by others.

What I want to do is basically read the file into a string and then either chop off everything after the last > and append my own >, or just replace the 0x00 character with an empty string.

A: 

If you have invalid XML, then you'll have to correct it as a regular binary file before parsing it as an XML document.
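A minimal sketch of that binary clean-up, assuming the only problem is one or more 0x00 bytes after the closing tag. The names `TrimTrailingNulls` and `LoadCleanedXml` are made up for illustration. It also assumes a single-byte-compatible encoding such as UTF-8 or ASCII; in a UTF-16 file a trailing zero byte can be half of a real character, so this would corrupt it.

```vb
Imports System.IO
Imports System.Xml

Module NullByteCleanup
    ' Hypothetical helper: returns a copy of data without trailing 0x00 bytes.
    ' Assumes a UTF-8/ASCII-style encoding where 0x00 is never part of a character.
    Function TrimTrailingNulls(data As Byte()) As Byte()
        Dim length As Integer = data.Length
        While length > 0 AndAlso data(length - 1) = 0
            length -= 1
        End While
        ' VB arrays are declared by upper bound, so (length - 1) gives length elements.
        Dim result(length - 1) As Byte
        Array.Copy(data, result, length)
        Return result
    End Function

    ' Hypothetical helper: cleans the raw bytes, then lets XmlDocument parse them.
    Function LoadCleanedXml(fileName As String) As XmlDocument
        Dim cleaned As Byte() = TrimTrailingNulls(File.ReadAllBytes(fileName))
        Dim doc As New XmlDocument()
        Using stream As New MemoryStream(cleaned)
            doc.Load(stream)
        End Using
        Return doc
    End Function
End Module
```

If the text is already in a string, `xmlString.TrimEnd(ControlChars.NullChar)` does the equivalent job; plain `Trim` with no arguments only strips whitespace, and U+0000 is not whitespace.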

Stephen Cleary
+3  A: 

Okay, to start with, your code for reading a file is broken. It will usually work, but you should pretty much never ignore the return value from Stream.Read. You should also close streams using a Using statement or Finally block. Fortunately, there's an incredibly easy way of replacing your code:

Dim xmlString As String = File.ReadAllText(FileName)
Doc.LoadXml(xmlString)
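For comparison, here is a sketch of what a correct manual read could look like, in case `File.ReadAllText` is not an option. `ReadAllBytesManually` is a hypothetical name. One detail worth noting about the original code: in VB.NET, `Dim buffer(fs.Length) As Byte` declares an array by its upper bound, so it has `fs.Length + 1` elements. The extra element stays 0, decodes to U+0000, and `Trim` does not remove it, which may well be where the 0x00 at the very end comes from (the file itself could also contain one).

```vb
Imports System.IO

Module SafeFileRead
    ' Hypothetical helper: reads a whole file, honouring Stream.Read's return value.
    Function ReadAllBytesManually(fileName As String) As Byte()
        Using fs As FileStream = File.Open(fileName, FileMode.Open, FileAccess.Read)
            Dim total As Integer = CInt(fs.Length)
            ' Upper bound is total - 1, so the array holds exactly total bytes.
            Dim buffer(total - 1) As Byte
            Dim offset As Integer = 0
            While offset < total
                ' Read may return fewer bytes than requested, so keep going.
                Dim read As Integer = fs.Read(buffer, offset, total - offset)
                If read = 0 Then
                    Throw New EndOfStreamException("File ended before all bytes were read.")
                End If
                offset += read
            End While
            Return buffer
        End Using
    End Function
End Module
```

The `Using` block closes the stream even if `Read` throws, which covers the second point above.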

On the other hand, you claim that the encoding isn't always UTF-8 - so why are you always trying to use UTF-8? It would actually be better if you loaded it as plain bytes:

Dim bytes As Byte() = File.ReadAllBytes(FileName)
Using stream As MemoryStream = New MemoryStream(bytes)
    Doc.Load(stream)
End Using

or more easily:

Doc.Load(FileName)

Now, if you do that, do you still get the same error? If so, the file itself is broken...

Jon Skeet
I was using `Doc.Load(File)` to begin with, which brought up the encoding issue and it also brought up the current hex issue. I will try the other suggestions you have and see if they work.
Nalum
Using the `Using stream As MemoryStream = new MemoryStream(bytes)` seems to have fixed the hex issue and the encoding issue. Thank you Jon
Nalum