I am developing a generic HTTP handler in VS2005 and testing it in Debug mode. It works well except when the query string contains non-ASCII (high-bit) characters, e.g. Latin Small Letter Thorn (U+00FE, þ) and Latin Small Letter Ae (U+00E6, æ).

IE8 on my machine is set to send UTF-8 URLs. I am typing the following into the IE8 address bar when debugging the code:

    http://app/myHandler.ashx?term=foo  // everything works
    http://app/myHandler.ashx?term=þorn  // does not work -- query from database fails

The database is SQLite, using UTF-8 encoding, and it works fine: queries containing these special characters succeed when issued directly against SQLite from other GUI tools or from the System.Data.SQLite GUI add-ins for Visual Studio.

Am I decoding the values from the Query String correctly? Does GetString() not decode the bytes?

    public StandardRequest(HttpContext context)
    {
        UTF8Encoding utf8 = new UTF8Encoding();

        if (context.Request.QueryString["term"] != null)
        {
            byte[] w = utf8.GetBytes(context.Request.QueryString["term"]);
            word = utf8.GetString(w);
            ...

In the HTTP handler, ContentEncoding is set to UTF-8:

     context.Response.ContentEncoding = System.Text.Encoding.UTF8;

and in the debugger's Locals window, Request.ContentEncoding is also UTF-8.

But when I examine the query string in the Locals window, the term value 'þorn' is displayed as '[]orn', and that is how it appears in the SQL statement that I'm sending to the database. It's as if the character hasn't been recognized.

Am I doing something wrong in the way the value is being grabbed from the query string and converted to a string?

Thanks very much for the help.

A: 

What does context.Request.QueryString["term"] contain, as integer character codes, before decoding? Maybe it already has the value you want. If the bytes that were sent weren't UTF-8 to begin with, utf8.GetBytes won't help.
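
One way to check this is sketched below. The class and method names (QueryStringDiagnostics, DumpCodeUnits, RoundTripUtf8) are made up for illustration and are not part of the original handler. The first method lists the character codes of a string so you can see exactly what arrived; the second shows that encoding a string to UTF-8 and decoding it straight back just returns the same string, so it cannot repair a character that was already lost. If the dump shows something like U+FFFD (the Unicode replacement character) where U+00FE should be, that would suggest the damage happened before your code ran.

```csharp
using System;
using System.Text;

// Hypothetical diagnostic helpers; names are illustrative only.
public static class QueryStringDiagnostics
{
    // Lists each UTF-16 code unit of the string as "U+XXXX", space-separated.
    public static string DumpCodeUnits(string value)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in value)
        {
            if (sb.Length > 0)
            {
                sb.Append(' ');
            }
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    // Encodes to UTF-8 and decodes straight back. For a well-formed string
    // this returns an equal string, so it neither fixes nor changes anything.
    public static string RoundTripUtf8(string value)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        return utf8.GetString(utf8.GetBytes(value));
    }
}
```

In the handler, something like QueryStringDiagnostics.DumpCodeUnits(context.Request.QueryString["term"]) could be written to the debug output; for a correctly received 'þorn' it would give U+00FE U+006F U+0072 U+006E.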

eed3si9n

A: 

Thanks for the tip, eed3si9n. It led me to the solution.

I was under the (mistaken) impression that IE would convert characters typed by hand into the address bar into the encoding specified in Settings. It doesn't. The URL typed there must already be encoded.
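
For anyone building the test URL in code rather than typing it, here is a minimal sketch of encoding the term up front. TermUrlBuilder and BuildUrl are hypothetical names for illustration; the base URL is the one from the question. Uri.EscapeDataString percent-encodes the UTF-8 bytes of the string, so þ (U+00FE, UTF-8 bytes C3 BE) becomes %C3%BE, which ASP.NET should decode back to 'þorn' when the request encoding is UTF-8.

```csharp
using System;

// Hypothetical helper; the name and the hard-coded base URL are illustrative.
public static class TermUrlBuilder
{
    // Builds the handler URL with the term percent-encoded as UTF-8.
    public static string BuildUrl(string term)
    {
        return "http://app/myHandler.ashx?term=" + Uri.EscapeDataString(term);
    }
}
```

TermUrlBuilder.BuildUrl("þorn") gives http://app/myHandler.ashx?term=%C3%BEorn, which can be pasted into the address bar as-is.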

Tim