I'm trying to call a REST webservice provided by a LIMS (basically a chemistry lab database plus interface). It was working great until some non-ASCII characters showed up (specifically characters with circumflexes, umlauts, etc.).

When I call the webservice with the value àèïõû, the query string contains the following argument:

&componentValue=àèïõû

HttpWebRequest, whether the value is sent without any pre-escaping or with Uri.EscapeDataString() called on it, gives:

Ã Ã¨Ã¯ÃµÃ»

Firefox, given the same URL that was passed to HttpWebRequest, gives the correct value:

àèïõû

Now for the escaping itself: Uri.EscapeDataString() appears to escape "àèïõû" as:

%C3%A0%C3%A8%C3%AF%C3%B5%C3%BB

Firefox escapes "àèïõû" as:

%E0%E8%EF%F5%FB

As the latter works, I would of course prefer to use it as my escape method, but I really don't know where to begin. I've found plenty of information on handling encodings for the response data, but nothing for the request.

A: 

From MSDN:

Uri.EscapeDataString Method

[...] All Unicode characters are converted to UTF-8 format before being escaped.

So what you're seeing is the UTF-8 encoded version of àèïõû.
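To make the difference concrete, here is a minimal sketch that prints the bytes each encoding produces and shows what happens when UTF-8 bytes are read back as Latin-1. The class and method names (`EncodingBytesDemo`, `HexBytes`, `ReadUtf8AsLatin1`) are made up for illustration, and the assumption that the webservice decodes as ISO-8859-1 is inferred from the Firefox escape working, not confirmed.

```csharp
using System;
using System.Linq;
using System.Text;

static class EncodingBytesDemo
{
    // Formats the bytes that an encoding produces for a string, e.g. "C3 A0".
    public static string HexBytes(string value, Encoding encoding)
    {
        return string.Join(" ", encoding.GetBytes(value).Select(b => b.ToString("X2")));
    }

    // Simulates a receiver that decodes UTF-8 bytes as ISO-8859-1 (assumed behaviour).
    public static string ReadUtf8AsLatin1(string value)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(value);
        return Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);
    }
}
```

For "àèïõû", `HexBytes` with `Encoding.UTF8` gives `C3 A0 C3 A8 C3 AF C3 B5 C3 BB`, and with `Encoding.GetEncoding("ISO-8859-1")` it gives `E0 E8 EF F5 FB`. These match the two escaped forms in the question. `ReadUtf8AsLatin1` returns two characters per original character, each pair starting with `Ã`, which is the kind of garbling described.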

Unlike Uri.EscapeDataString, HttpUtility.UrlEncode allows you to specify an encoding explicitly:

HttpUtility.UrlEncode("àèïõû", Encoding.GetEncoding("latin1"));

Alternatively, you could write your own version; for example:

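// Percent-encodes every Latin-1 byte of the string, including plain ASCII.
// Requires "using System.Linq;" and "using System.Text;".
string.Concat(Encoding
   .GetEncoding("latin1")
   .GetBytes("àèïõû")
   .Select(x => "%" + x.ToString("x2"))
   .ToArray());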

Both result in "%e0%e8%ef%f5%fb".
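If you go the roll-your-own route, a slightly fuller sketch might leave the RFC 3986 unreserved characters alone and escape only the remaining bytes. `Latin1UrlEscaper` and `Escape` are hypothetical names, not part of the framework. Characters that the chosen encoding cannot represent are replaced by the encoder's fallback (typically `?`), so this only round-trips text that fits in the target code page.

```csharp
using System;
using System.Text;

static class Latin1UrlEscaper
{
    // RFC 3986 "unreserved" characters, which never need escaping.
    private const string Unreserved =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Percent-encodes a query-string value using the bytes of the given encoding.
    public static string Escape(string value, Encoding encoding)
    {
        var result = new StringBuilder();
        foreach (byte b in encoding.GetBytes(value))
        {
            char c = (char)b;
            if (b < 0x80 && Unreserved.IndexOf(c) >= 0)
            {
                result.Append(c); // safe ASCII character, keep as-is
            }
            else
            {
                result.Append('%').Append(b.ToString("X2")); // everything else as %XX
            }
        }
        return result.ToString();
    }
}
```

Called as `Latin1UrlEscaper.Escape("àèïõû", Encoding.GetEncoding("ISO-8859-1"))`, this returns `%E0%E8%EF%F5%FB`, and the result can be appended after `&componentValue=`. It is worth checking the URL that actually goes over the wire, since I have not verified how Uri and HttpWebRequest treat pre-escaped non-UTF-8 sequences in every framework version.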

A better solution would probably be to accept UTF-8 encoded query strings in the webservice.

dtb
Thanks for the quick answer and the extra information. I don't have control of the webservice, but I have sent the information to the vendor.
Chris B
A: 

It appears that Uri.HexEscape() will do what you want, but only one character at a time. I'd roll your own escaping function and hope that your codepage is always the same codepage that the webservice is using, since it appears that the webservice doesn't support Unicode.
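A rough sketch of that idea, looping over the string and calling Uri.HexEscape() per character. It relies on the fact that Unicode code points U+0000 to U+00FF coincide with ISO-8859-1, so it only matches the webservice if that is really the code page in use (Windows-1252, for instance, differs in the 0x80 to 0x9F range). `HexEscapeSketch` is a made-up name.

```csharp
using System;
using System.Text;

static class HexEscapeSketch
{
    // Escapes each character individually with Uri.HexEscape.
    public static string Escape(string value)
    {
        var result = new StringBuilder();
        foreach (char c in value)
        {
            bool unreserved =
                (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                (c >= '0' && c <= '9') ||
                c == '-' || c == '.' || c == '_' || c == '~';

            // Uri.HexEscape throws ArgumentOutOfRangeException for characters
            // above 255, so anything outside Latin-1 fails loudly here.
            result.Append(unreserved ? c.ToString() : Uri.HexEscape(c));
        }
        return result.ToString();
    }
}
```

The exception for characters above 255 is arguably a feature in this situation: it flags input that the webservice could not have represented anyway, instead of silently sending a substitute.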

Lee