Well, I have a byte array, and I know it contains an XML-serialized object. Is there any way to get the encoding from it?
I'm not going to deserialize it, but I'm saving it in an XML field on a SQL Server... so I need to convert it to a string?
The first 2 or 3 bytes may be a BOM which can tell you whether the stream is UTF-8, Unicode-LittleEndian or Unicode-BigEndian.
UTF-8 BOM is 0xEF 0xBB 0xBF
Unicode-BigEndian BOM is 0xFE 0xFF
Unicode-LittleEndian BOM is 0xFF 0xFE
If none of these are present then you can decode the start of the array as ASCII and test for <?xml (note that most modern XML generation sticks to the standard that no white space may precede the XML declaration).
Everything up to the closing ?> is ASCII, so you can check for the presence of encoding= and read its value.
If encoding isn't present, or the <?xml declaration itself is not present, then you can assume UTF-8.
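The steps above could be sketched roughly like this. This is only a minimal sketch: the class and method names (XmlEncodingSniffer, Detect) are made up for illustration, the 200-byte window for the declaration is an arbitrary assumption, and UTF-32 BOMs are not handled.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class XmlEncodingSniffer
{
    // Hypothetical helper, not part of any library.
    public static Encoding Detect(byte[] bytes)
    {
        // 1. Look for a byte order mark in the first 2 or 3 bytes.
        if (bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
            return Encoding.UTF8;
        if (bytes.Length >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
            return Encoding.BigEndianUnicode;   // UTF-16 big endian
        if (bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
            return Encoding.Unicode;            // UTF-16 little endian

        // 2. No BOM: decode the start as ASCII and look for the XML declaration.
        //    The 200-byte window is an assumption; declarations are normally much shorter.
        string head = Encoding.ASCII.GetString(bytes, 0, Math.Min(bytes.Length, 200));
        if (head.StartsWith("<?xml", StringComparison.Ordinal))
        {
            int end = head.IndexOf("?>", StringComparison.Ordinal);
            if (end > 0)
            {
                Match m = Regex.Match(
                    head.Substring(0, end),
                    "encoding\\s*=\\s*[\"']([A-Za-z0-9._-]+)[\"']");
                if (m.Success)
                {
                    // Throws ArgumentException for names the runtime does not know.
                    return Encoding.GetEncoding(m.Groups[1].Value);
                }
            }
        }

        // 3. No BOM and no encoding attribute: assume UTF-8.
        return Encoding.UTF8;
    }
}
```

Calling XmlEncodingSniffer.Detect(bytes) then gives you an Encoding you can pass to GetString or a StreamReader. On .NET Core and later, some legacy code page names may need the code pages encoding provider registered before GetEncoding will accept them.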
A solution similar to this question could solve this by using a Stream over the byte array. Then you won't have to fiddle at the byte level. Like this:
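    using System.IO;
    using System.Text;
    using System.Xml;

    // bytes is the byte array holding the serialized XML.
    Encoding encoding;
    using (var stream = new MemoryStream(bytes))
    {
        using (var xmlreader = new XmlTextReader(stream))
        {
            // Moving to the first content node makes the reader parse any BOM and the XML declaration.
            xmlreader.MoveToContent();
            encoding = xmlreader.Encoding;
        }
    }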
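Since the question also asks about getting a string out of the byte array, here is one way the detected encoding might be used for that. Again just a sketch: XmlBytes and ToXmlString are names invented for this example, and the fallback to UTF-8 is an assumption for the case where the reader reports no encoding.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class XmlBytes
{
    // Hypothetical helper: decodes XML bytes using the encoding the XmlTextReader reports.
    public static string ToXmlString(byte[] bytes)
    {
        Encoding encoding;
        using (var stream = new MemoryStream(bytes))
        using (var xmlReader = new XmlTextReader(stream))
        {
            xmlReader.MoveToContent();
            // Assumed fallback in case no encoding is reported.
            encoding = xmlReader.Encoding ?? Encoding.UTF8;
        }

        // Decode through a StreamReader so a leading BOM is consumed
        // instead of ending up as a character in the string.
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, encoding, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Note that MoveToContent throws an XmlException if the start of the data is not well-formed XML, so you may want to catch that if the byte array comes from an untrusted source.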