Well, I have a byte array, and I know it holds an XML-serialized object. Is there any way to get the encoding from it?

I'm not going to deserialize it, but I'm saving it in an XML field on a SQL Server... so I need to convert it to a string?

+1  A: 
Jon Skeet
Downvoters: if you're going to downvote, please provide a comment. Otherwise the downvote serves no real purpose.
Jon Skeet
+3  A: 

The first 2 or 3 bytes may be a BOM which can tell you whether the stream is UTF-8, Unicode-LittleEndian or Unicode-BigEndian.

UTF-8 BOM is 0xEF 0xBB 0xBF
Unicode-BigEndian (UTF-16 BE) BOM is 0xFE 0xFF
Unicode-LittleEndian (UTF-16 LE) BOM is 0xFF 0xFE

If none of these are present then you can use ASCII to test for <?xml (note that most modern XML generators follow the standard's rule that no white space may precede the XML declaration).

ASCII is used up until ?>, so you can look for the presence of encoding= and read its value. If encoding is not present, or there is no <?xml declaration, then you can assume UTF-8.
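
The steps above could be sketched roughly as follows. This is only a minimal illustration, not a full XML encoding detector: the class name XmlEncodingSniffer, the method name Detect and the 256-byte prefix are assumptions made for the example, and it ignores UTF-32 BOMs and BOM-less UTF-16 input.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, named here for illustration only.
static class XmlEncodingSniffer
{
    public static Encoding Detect(byte[] bytes)
    {
        // Step 1: look for a byte order mark.
        if (bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
            return Encoding.UTF8;
        if (bytes.Length >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
            return Encoding.BigEndianUnicode;   // UTF-16 big-endian
        if (bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
            return Encoding.Unicode;            // UTF-16 little-endian

        // Step 2: no BOM, so read a short prefix as ASCII (256 bytes is an
        // arbitrary choice) and look for the XML declaration.
        string head = Encoding.ASCII.GetString(bytes, 0, Math.Min(bytes.Length, 256));
        if (head.StartsWith("<?xml", StringComparison.Ordinal))
        {
            int end = head.IndexOf("?>", StringComparison.Ordinal);
            if (end > 0)
            {
                Match m = Regex.Match(
                    head.Substring(0, end),
                    @"encoding\s*=\s*[""']([A-Za-z0-9._-]+)[""']");
                if (m.Success)
                {
                    // Throws ArgumentException if the runtime does not know the name.
                    return Encoding.GetEncoding(m.Groups[1].Value);
                }
            }
        }

        // Step 3: no declaration or no encoding attribute, so assume UTF-8.
        return Encoding.UTF8;
    }
}
```

On .NET Core and later, Encoding.GetEncoding only recognises a small built-in set of names unless an extra encoding provider has been registered, so a declared code page such as windows-1252 may need that registration first.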

AnthonyWJones
+6  A: 

A solution similar to this question could solve this by using a Stream over the byte array. Then you won't have to fiddle at the byte level. Like this:

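// Requires: using System.IO; using System.Text; using System.Xml;
// bytes is the byte[] holding the serialized XML.
Encoding encoding;
using (var stream = new MemoryStream(bytes))
{
    using (var xmlreader = new XmlTextReader(stream))
    {
        // Reading up to the first content node makes the reader process the
        // BOM and the XML declaration, which determines its Encoding.
        xmlreader.MoveToContent();
        encoding = xmlreader.Encoding;
    }
}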
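
Since the question also asks about converting the bytes to a string, one possible follow-up is to decode the array with the detected encoding. This is a minimal sketch under that assumption: the class XmlBytes and the method ToXmlString are hypothetical names, not part of the original answer.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, named here for illustration only.
static class XmlBytes
{
    public static string ToXmlString(byte[] bytes)
    {
        Encoding encoding;
        using (var stream = new MemoryStream(bytes))
        using (var xmlreader = new XmlTextReader(stream))
        {
            // Throws XmlException if the bytes are not parseable XML.
            xmlreader.MoveToContent();
            encoding = xmlreader.Encoding;
        }

        // A StreamReader consumes a leading BOM, so the BOM does not end up
        // as a stray character at the start of the returned string.
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The returned string keeps the XML declaration exactly as it appeared in the bytes, including its encoding attribute.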
Peter Lillevold