I am sending a large string from Delphi 5 to a C# web service, and I'm having a lot of trouble with pound (£) signs. I URL-encode the string on the Delphi side (which seems to convert them to '%A3'). When the string reaches the C# web service, each one appears as '�'. I have tried changing the encoding of the string on the C# side by using a StreamReader (shown below), but the best I can get it to do is change it to a question mark (?).

MemoryStream mr = new MemoryStream(System.Text.Encoding.Default.GetBytes(myString));
StreamReader sr = new StreamReader(mr, System.Text.Encoding.Default);
string s = sr.ReadToEnd();

How can I get the £ signs to be interpreted correctly?

Please help!

(Further info requested)

The web service signature is:

[WebMethod]
public string ReadMyString(string PostedString)

The Delphi 5 code uses third party components/code that we've been using successfully for years, but this is the first time we've tried talking directly to C#. An outline of the code is shown below:

 tmp_Str := URLEncode(myBigString);
 tmp_Str := WinInetPostData(myURL, tmp_Str);

Between these two lines I have confirmed that the £ signs have been correctly converted to '%A3'.

A: 

I would recommend using Base64 encoding on the Delphi side (add the EncdDecd unit to your uses clause and call EncodeString) and decoding it on the C# side with:

public static string DecodeString(string base64EncodedString)
  {
     byte[] dataToDecode = Convert.FromBase64String(base64EncodedString);
     string result = Encoding.ASCII.GetString(dataToDecode);
     return result;
  }

Good luck.

KevinRF
Thanks for your suggestion. I tried it and still get the question marks I'm afraid. I'm convinced the issue is on the C# side...
JamesW
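
One possible explanation for the question marks, assuming the Base64 payload carries the pound sign as the single byte 0xA3: Encoding.ASCII only covers 0x00 to 0x7F, so any higher byte decodes to '?'. The sketch below is illustrative only. The literal "ow==" is just the byte 0xA3 in Base64, not data from the original post, and ISO-8859-1 stands in for Windows-1252 because it is available on every .NET runtime and maps 0xA3 the same way.

```csharp
using System;
using System.Text;

class AsciiDecodeDemo
{
    static void Main()
    {
        // "ow==" is the Base64 form of the single byte 0xA3 (illustrative input).
        byte[] data = Convert.FromBase64String("ow==");

        // ASCII has no character for bytes above 0x7F, so 0xA3 becomes '?'.
        string asAscii = Encoding.ASCII.GetString(data);

        // ISO-8859-1 (like Windows-1252) maps 0xA3 to the pound sign.
        string asLatin1 = Encoding.GetEncoding("iso-8859-1").GetString(data);

        Console.WriteLine(asAscii == "?");       // True
        Console.WriteLine(asLatin1 == "\u00A3"); // True
    }
}
```

If this is what happened, swapping Encoding.ASCII for the encoding the Delphi side actually used would be the thing to try.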
A: 

As far as I know, strings in .NET are UTF-16, so you could try using the corresponding WideString type in Delphi versions before 2009. In Delphi 2009, strings are UTF-16 by default, but make sure you enforce the encoding.

Tihauan
Sorry, I should have mentioned that it's Delphi 5 (ancient!), but I've tried your suggestion and didn't have any luck, I'm afraid. Thanks anyway.
JamesW
A: 

From the .Net help for System.Text.Encoding.Default:

"Gets an encoding for the system's current ANSI code page."

It looks like 0xA3 isn't a pound sign in that code page. Switch to UTF-8 and it should decode that particular character correctly, but whether that is the correct encoding overall (that is, whether it is what Delphi is emitting) I can't say.

You can switch to UTF-8 by changing your first line like so:

MemoryStream mr = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(myString));

locster
Thanks but I still get the little squares with your suggestion.
JamesW
Scrub that. A pound sign is two bytes in UTF-8. Try specifying a code page explicitly using the GetEncoding(int) overload, e.g. GetEncoding(1252) for Windows-1252, the Western European code page used in the UK. See http://www.lingoes.net/en/translator/codepage.htm for a list of code pages.
locster
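
To make the difference concrete, here is a minimal sketch of what happens to the byte behind '%A3' under the two encodings discussed above. It assumes the percent-decoding has already produced the raw byte. ISO-8859-1 is used as a stand-in because every .NET runtime supports it. GetEncoding(1252) maps 0xA3 the same way, but on .NET Core and later it needs CodePagesEncodingProvider to be registered first.

```csharp
using System;
using System.Text;

class DecodeA3Demo
{
    static void Main()
    {
        // The byte that '%A3' stands for once percent-decoding is done.
        byte[] raw = { 0xA3 };

        // As UTF-8, a lone 0xA3 is an invalid sequence and becomes U+FFFD.
        string asUtf8 = Encoding.UTF8.GetString(raw);

        // Stand-in for Encoding.GetEncoding(1252); both map 0xA3 to the pound sign.
        string asLatin1 = Encoding.GetEncoding("iso-8859-1").GetString(raw);

        Console.WriteLine(asUtf8 == "\uFFFD");   // True
        Console.WriteLine(asLatin1 == "\u00A3"); // True
    }
}
```

U+FFFD is the replacement character, which is typically rendered as '�' or a small square.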
+1  A: 

Got it! The URLEncode function in Delphi (which uses a third-party component called NMURL) is encoding £ as '%A3', when it should in fact be '%C2%A3'. I did a manual replace on the Delphi side to correct it and then it requires no manipulation at all on the C# side.

Thanks for all your suggestions. That'll teach me to put my faith in old components!

JamesW
Do yourself a **giant** favor and ditch the NetMasters code with all haste! It was buggy when it was released with Delphi, and now it's still buggy, but also unsupported. Switch to a good library like Indy or ICS as soon as possible.
Rob Kennedy
+3  A: 

Based on what you wrote in your own answer, it looks like the problem is in how the client side is encoding the string, not in how the server is interpreting it (although the server needs to cooperate no matter what encoding you use). You're evidently expecting it to be encoded as UTF-8 (that's the default for StreamReader if you don't specify anything else), but I wouldn't be surprised if the NetMasters library you're using doesn't even know about UTF-8 or any other form of Unicode.

Delphi 5 can handle Unicode just fine via its WideString type, but it lacks a lot of support utility functions. If you want to keep your code with NetMasters, then the minimal change for you is to introduce a Unicode-enabled library, such as the JclUnicode unit from the free JCL. There you can find a Utf8Encode function that will receive a WideString and return an AnsiString, which is then suitable for passing to your existing URL-encoding function.

Better would be to get rid of the NM code altogether. The free Indy library has functions for UTF-8-encoding and URL-encoding, as well as all your other Internet-related tasks.

If you're not using Unicode on the client side, then there's no reason to expect "£" to ever be encoded as the two-byte sequence C2 A3. That's the UTF-8-encoded form of U+00A3, the code point for the pound character.
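
The byte sequences can be checked from the C# side. This is only a sketch for comparison, not the Delphi client's code. ISO-8859-1 is used as a representative single-byte encoding (Windows-1252 gives the same byte for this character).

```csharp
using System;
using System.Text;

class EncodePoundDemo
{
    static void Main()
    {
        string pound = "\u00A3"; // the pound sign, U+00A3

        // UTF-8 needs two bytes for U+00A3.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(pound))); // C2-A3

        // A single-byte Western European encoding needs one.
        Console.WriteLine(BitConverter.ToString(
            Encoding.GetEncoding("iso-8859-1").GetBytes(pound))); // A3

        // Uri.EscapeDataString percent-encodes the UTF-8 bytes.
        Console.WriteLine(Uri.EscapeDataString(pound)); // %C2%A3
    }
}
```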

If you're not using Unicode on the client, then you'll have to find out what code page you are using. Then, specify that encoding on the server when you create the new StreamReader.
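
A rough sketch of that last step, under the assumption that the server has access to the raw request bytes. ReadAll is a hypothetical helper, and ISO-8859-1 again stands in for whatever code page the client really uses. The second half shows why a MemoryStream/StreamReader round trip like the one in the question cannot help once the string has already been decoded with the wrong encoding.

```csharp
using System;
using System.IO;
using System.Text;

class StreamReaderEncodingDemo
{
    // Hypothetical helper: decode a byte array through a StreamReader.
    static string ReadAll(byte[] bytes, Encoding encoding)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), encoding))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        byte[] raw = { 0xA3 }; // pound sign as sent by a single-byte client
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // Reading the original bytes with the matching encoding works.
        Console.WriteLine(ReadAll(raw, latin1) == "\u00A3"); // True

        // Decoding as UTF-8 replaces the byte with U+FFFD; the 0xA3 is lost.
        string damaged = ReadAll(raw, Encoding.UTF8);

        // Re-encoding and re-decoding the damaged string cannot restore it.
        string retry = ReadAll(latin1.GetBytes(damaged), latin1);
        Console.WriteLine(retry == "\u00A3"); // False
    }
}
```

In other words, the encoding has to be right at the point where bytes first become a string; patching the string afterwards is too late.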

Rob Kennedy