Hi,

I have a C# .NET web project that has a globalization tag set to:

<globalization requestEncoding="utf-8" responseEncoding="utf-8" culture="nb-no" uiCulture="no"/>

The problem occurs when a Flash application requests this URL (you get the same problem when you enter the URL manually in a browser): c_product_search.aspx?search=kjøkken (alternatively: c_product_search.aspx?search=kj%F8kken)

Both return the following character codes:

k U+006b 107
j U+006a 106
� U+fffd 65533
k U+006b 107
k U+006b 107
e U+0065 101
n U+006e 110

I don't know too much about character encoding, but it seems that the ø has been replaced with the Unicode replacement character, right?

I tried to change the globalization tag to:

<globalization requestEncoding="iso-8859-1" responseEncoding="utf-8" culture="nb-no" uiCulture="no"/>

That made the request work. However, now, other searches on my page stopped working.

I also tried the following with similar results:

NameValueCollection qs = HttpUtility.ParseQueryString(Request.QueryString.ToString(), Encoding.GetEncoding("iso-8859-1"));
string search = (string)qs["search"];

What should I do?

Kind Regards,

nitech

A: 

I think your problem is in the Flash, not the .NET. It sends the special character in a weird way. Try to URL-encode the search string before you send it to the server.

cRichter
Well, in Flash there is a function called escape. I've tried it and it makes no difference. Also, I get the same problem if I simply type the URL into the address bar manually.
nitech
A: 

If the app is expecting the URL-encoded request to be based on UTF-8, the character "ø" should be "%C3%B8", not "%F8". Whatever function you're using to escape/encode that request, you probably need to pass it the name of the underlying character encoding, "UTF-8".
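To make the difference concrete, here is a minimal sketch that percent-encodes the same search term under both encodings with HttpUtility.UrlEncode. The class and method names (EncodeDemo, Encode) are hypothetical and only serve the illustration; the point is that "ø" is two bytes (C3 B8) in UTF-8 and one byte (F8) in ISO-8859-1.

```csharp
using System;
using System.Text;
using System.Web;

public static class EncodeDemo
{
    // Percent-encodes a value using the named character encoding.
    // (Hypothetical helper, not part of the original post.)
    public static string Encode(string value, string encodingName)
    {
        return HttpUtility.UrlEncode(value, Encoding.GetEncoding(encodingName));
    }

    public static void Main()
    {
        // "\u00F8" is the letter ø, written as an escape so the
        // source file's own encoding does not matter.
        string term = "kj\u00F8kken";

        // Same text, two different byte sequences on the wire.
        Console.WriteLine(Encode(term, "utf-8"));       // ø is sent as bytes C3 B8
        Console.WriteLine(Encode(term, "iso-8859-1"));  // ø is sent as byte F8
    }
}
```

A server that decodes with UTF-8 only understands the first form.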

Alan Moore
You're onto something there. It seems URL-encoding differs between AS2 and AS3. I made a sample in AS2 that works fine, but apparently the same sample encodes differently in AS3. This post says more about it, although it does not provide a solution: http://www.ultrashock.com/forums/actionscript/as3-escape-vs-as2-escape-122046.html
nitech
Also, it doesn't make sense that I decode ø to U+fffd if the ultrashock forum is correct when it says AS3 encodes with ISO-8859-1.
nitech
If your .NET backend is expecting the URL-encoded form to be based on UTF-8, it will see %F8 as an error and replace it with the standard replacement character, U+FFFD. What happens if you manually enter the string in its UTF-8 form, "kj%C3%B8kken"?
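The decoding side can be checked outside the web application. The sketch below assumes HttpUtility.UrlDecode with an explicit encoding behaves like the request pipeline does for query strings; DecodeDemo and Decode are hypothetical names.

```csharp
using System;
using System.Text;
using System.Web;

public static class DecodeDemo
{
    // Percent-decodes a value using the named character encoding.
    // (Hypothetical helper, not part of the original post.)
    public static string Decode(string encoded, string encodingName)
    {
        return HttpUtility.UrlDecode(encoded, Encoding.GetEncoding(encodingName));
    }

    // Prints each character with its code point, like the table in the question.
    private static void Dump(string label, string text)
    {
        Console.WriteLine(label);
        foreach (char c in text)
            Console.WriteLine("  U+{0:X4} {1}", (int)c, (int)c);
    }

    public static void Main()
    {
        // The byte F8 is not valid UTF-8, so the decoder substitutes U+FFFD.
        Dump("kj%F8kken as utf-8", Decode("kj%F8kken", "utf-8"));

        // The same byte is the letter ø (U+00F8) in ISO-8859-1.
        Dump("kj%F8kken as iso-8859-1", Decode("kj%F8kken", "iso-8859-1"));

        // The UTF-8 form C3 B8 decodes to ø under UTF-8.
        Dump("kj%C3%B8kken as utf-8", Decode("kj%C3%B8kken", "utf-8"));
    }
}
```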
Alan Moore
A: 

It turns out that ActionScript 2.0 sends the URL encoded/escaped with UTF-8, while ActionScript 3.0 uses ISO-8859-1. The way to solve this was to set Request.ContentEncoding inside Global.asax when an encoding is specified in the URL:

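void Application_BeginRequest(object sender, EventArgs e)
{
    HttpContext ctx = HttpContext.Current;

    // Was an encoding specified in the request, e.g. ?encoding=iso-8859-1
    string encodingName = ctx.Request["encoding"];
    if (!String.IsNullOrEmpty(encodingName))
    {
        // Note: GetEncoding throws ArgumentException for an unknown name.
        ctx.Request.ContentEncoding = System.Text.Encoding.GetEncoding(encodingName);
    }
}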

Could it be done differently?

Regards, nitech

nitech
+1  A: 

The problem comes from the combination of Firefox and ASP.NET. When you manually enter a URL in Firefox's address bar and the URL contains French or Swedish characters, Firefox encodes the URL with "ISO-8859-1" by default.

But when ASP.NET receives such a URL, it assumes it is UTF-8 encoded, and the encoded characters become "U+fffd". I couldn't find a way in ASP.NET to detect that the URL is "ISO-8859-1". Request.Encoding is set to utf-8 ... :(

Several solutions exist:

  • Put <globalization requestEncoding="iso-8859-1" responseEncoding="iso-8859-1"/> in your Web.config. But this may come with other problems, and your application won't be standard anymore (it will not work with languages like Japanese). And anyway, I prefer using UTF-8!

  • Go to about:config in Firefox and set the value of network.standard-url.encode-query-utf8 to true. It will now work for you (Firefox will encode all your URLs with UTF-8), but not for anybody else.

  • The least bad solution I could come up with was to handle this in code. If the default decoding didn't work, we reparse the query string with ISO-8859-1:

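    string query = Request.QueryString["search"];
    // The value is already decoded here, so a failed UTF-8 decoding shows up as the character U+FFFD.
    if (query != null && query.Contains("\ufffd"))
        query = HttpUtility.ParseQueryString(Request.Url.Query, Encoding.GetEncoding("iso-8859-1"))["search"];
    // No extra HttpUtility.UrlDecode call: both paths return decoded text, and decoding twice would corrupt values containing '+' or '%'.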
    

It works with hyperlinks and manually entered URLs, in French, English, or Japanese. But I don't know how it will handle other encodings like ISO-8859-5 (Russian) ...
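The fallback idea can be isolated into a small helper that is testable without an HttpContext. This is only a sketch under stated assumptions: QueryFallback and GetParam are hypothetical names, the raw query string is passed in as text, and U+FFFD in the UTF-8 result is taken as the sign that decoding failed. A value whose ISO-8859-1 bytes happen to form valid UTF-8 would not be detected this way.

```csharp
using System;
using System.Text;
using System.Web;

public static class QueryFallback
{
    // Returns the named parameter decoded as UTF-8. If that result contains
    // the replacement character U+FFFD, the query is reparsed with the
    // fallback encoding instead. (Hypothetical helper for illustration.)
    public static string GetParam(string rawQuery, string name, string fallbackEncodingName)
    {
        string value = HttpUtility.ParseQueryString(rawQuery, Encoding.UTF8)[name];

        if (value != null && value.IndexOf('\uFFFD') >= 0)
        {
            Encoding fallback = Encoding.GetEncoding(fallbackEncodingName);
            value = HttpUtility.ParseQueryString(rawQuery, fallback)[name];
        }

        return value;
    }

    public static void Main()
    {
        // ISO-8859-1 form, as sent by a manually typed URL or by AS3.
        Console.WriteLine(GetParam("?search=kj%F8kken", "search", "iso-8859-1"));

        // UTF-8 form: the first parse succeeds and no fallback is needed.
        Console.WriteLine(GetParam("?search=kj%C3%B8kken", "search", "iso-8859-1"));
    }
}
```

In a page this would be called with Request.Url.Query as the raw query. Other fallbacks such as "iso-8859-5" can be passed the same way, though on newer .NET runtimes such code pages may need to be registered first, and nothing here can tell one single-byte encoding from another.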

Does anyone have a better solution?

This solves only the problem of manually entered URLs. In your hyperlinks, don't forget to encode URL parameters with HttpUtility.UrlEncode on the server, or with encodeURIComponent in JavaScript code. And use HttpUtility.UrlDecode to decode them.

Etienne Coumont
Thanks. If you look at the bottom, that solution worked for me. I was not able to change the encoding inside the page's C# code. I had to do it before the session was initiated, inside Global.asax. But perhaps your solution works because it does not try to change the encoding of the HttpContext.
nitech
A: 

Thanks, nitech, very cool.