I've been trying to track down a bug for hours now and it has come down to this:

Dim length As Integer = 300
Dim buffer() As Byte = binaryReader.ReadBytes(length)
Dim text As String = System.Text.Encoding.UTF8.GetString(buffer, 0, buffer.Length)

The problem is that the buffer contains 300 bytes, but the string 'text' is only 285 characters long. When I convert it back to bytes, I get 521 bytes... WTF?

The same code in a normal WinForms app works perfectly. The data being read by the binary reader is a UTF-8 encoded string. Any ideas why Silverlight is playing funny buggers?

A: 

I bet your stream contains some characters that require more than one byte. UTF-8 uses a single byte for ASCII characters, but two to four bytes for any character outside the ASCII range.

This explains why your buffer is longer than the string (300 vs 285).

Example:

string:  t    | e    | s    | t    | ä          (length = 5; the last char takes 2 bytes)
bytes:   0x74 | 0x65 | 0x73 | 0x74 | 0xc3 0xa4  (length = 6)
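
To check this yourself, here is a minimal console sketch (plain .NET rather than Silverlight; the module and variable names are only for illustration) that decodes the six bytes from the example and then re-encodes the result:

```vbnet
Imports System
Imports System.Text

Module Utf8LengthDemo
    Sub Main()
        ' The six bytes from the example: "test" followed by "ä" (0xC3 0xA4).
        Dim bytes() As Byte = {&H74, &H65, &H73, &H74, &HC3, &HA4}

        ' Decode with the same three-argument overload used in the question.
        Dim decoded As String = Encoding.UTF8.GetString(bytes, 0, bytes.Length)

        Console.WriteLine(bytes.Length)   ' 6 bytes in
        Console.WriteLine(decoded.Length) ' 5 characters out

        ' Re-encoding with UTF-8 gives back the same number of bytes.
        Dim roundTrip() As Byte = Encoding.UTF8.GetBytes(decoded)
        Console.WriteLine(roundTrip.Length) ' 6
    End Sub
End Module
```

If the data really is valid UTF-8, a decode followed by a UTF-8 encode should return the original byte count, as it does here.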

As to why it becomes even longer when you convert the text back to bytes, my best guess (also looking at the 521 size you get) is that you are using Encoding.Unicode instead of Encoding.UTF8 to perform the conversion. Encoding.Unicode is UTF-16, which uses two bytes for every Char in the string.
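
A quick way to test that guess is to encode one string both ways and compare the sizes. The sketch below uses a made-up 285-character string (270 ASCII characters plus 15 copies of "ä", chosen only so that the UTF-8 size comes out at 300 bytes); it is not your actual data, and the names are hypothetical:

```vbnet
Imports System
Imports System.Text

Module EncodingSizeDemo
    Sub Main()
        ' Hypothetical sample: 270 one-byte chars plus 15 two-byte chars (ä = U+00E4).
        Dim sample As String = New String("a"c, 270) & New String(ChrW(&HE4), 15)

        Dim utf8Bytes() As Byte = Encoding.UTF8.GetBytes(sample)
        Dim utf16Bytes() As Byte = Encoding.Unicode.GetBytes(sample)

        Console.WriteLine(sample.Length)     ' 285 characters
        Console.WriteLine(utf8Bytes.Length)  ' 300 bytes (270 * 1 + 15 * 2)
        Console.WriteLine(utf16Bytes.Length) ' 570 bytes (285 * 2)
    End Sub
End Module
```

Encoding.Unicode.GetBytes returns 570 bytes for any 285-Char string, so if you see a different number, an encoding mix-up on the way back is probably not the whole story.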

(btw. obviously this has nothing to do with Silverlight. You are probably testing the code with two different strings in WinForms vs. Silverlight. No worries, we've all made stupid mistakes like that :-) )

Francesco De Vittori
Probably close to an answer, however what is 2 * 285? It's not 521.
AnthonyWJones
Assuming we have 270 single-byte chars + 15 double-byte chars (i.e. 300 bytes/285 chars), the same string in Unicode would be 285*2 = 570 bytes, since each of those chars takes two bytes in UTF-16. Something still does not add up :-)
Francesco De Vittori
Thanks for the help. I'm using UTF8.GetBytes to perform the conversion and the strings are 100% identical.
Alex
I've done some investigating and I think it is the WebClient I use to download the string. If I use the EXACT same code in a normal WinForms application and in Silverlight, both WebClient and HttpWebRequest have trouble returning strings in Silverlight...
Alex
The code in your question is working fine (keeping in mind the UTF8 caveats mentioned above). Could you please post a more complete code snippet, so that we can try and reproduce the error?
Francesco De Vittori