ansaurus

Question

What's the difference between Request.Url.Query and Request.QueryString?

Answer 1

+2 A:

Your question really sparked my interest, so I've done some reading for the past hour or so. I'm not absolutely positive I've found the answer, but I'll throw it out there to see what you think.

From what I've read so far, Request.QueryString is actually "a parsed version of the QUERY_STRING variable in the ServerVariables collection" [reference], whereas Request.Url is (as you stated) the raw URL encapsulated in a Uri object. According to this article, the Uri class's constructor "...parses the [url string], puts it in canonical format, and makes any required escape encodings."

Therefore, it appears that Request.QueryString uses a different function to parse the "QUERY_STRING" variable from the ServerVariables collection than the one the Uri constructor uses. This would explain why you see a difference between the two. Why the custom parsing function and the Uri object's parsing function use different encoding methods is entirely beyond me. Maybe somebody a bit more versed in the aspnet_isapi DLL could provide some answers to that question.
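To make the raw-versus-parsed distinction described above a bit more concrete, here is a minimal sketch. It assumes that HttpUtility.ParseQueryString is a reasonable stand-in for whatever parsing ASP.NET does to build Request.QueryString (the real pipeline may well differ). The class name QueryDemo, its method names, and the sample URL are made up for illustration.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class QueryDemo
{
    // Raw form: the query as the Uri object exposes it, still percent-encoded.
    public static string RawQuery(string url) => new Uri(url).Query;

    // Parsed form: decoded name/value pairs, similar in shape to Request.QueryString.
    // Assumption: ParseQueryString stands in for ASP.NET's internal parsing.
    public static NameValueCollection ParsedQuery(string url) =>
        HttpUtility.ParseQueryString(new Uri(url).Query);

    public static void Main()
    {
        const string url = "http://www.example.com/test.aspx?search=caf%C3%A9&page=2";
        Console.WriteLine(RawQuery(url));               // ?search=caf%C3%A9&page=2
        Console.WriteLine(ParsedQuery(url)["search"]);  // café
        Console.WriteLine(ParsedQuery(url)["page"]);    // 2
    }
}
```

The point is only that one property hands back a single encoded string while the other hands back values that have already been decoded, so any disagreement about the encoding shows up in one and not the other.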

Anyway, hopefully my post makes sense. On a side note, I'd like to add another reference, which also made for some very thorough and interesting reading: http://download.microsoft.com/download/6/c/a/6ca715c5-2095-4eec-a56f-a5ee904a1387/Ch-12_HTTP_Request_Context.pdf

regex 2010-02-08 06:21:55

Both properties return the same encoded string most of the time - the constructors and parsing are irrelevant in this case. It's only after the rewrite call that the Uri's encoding changes.

zombat 2010-02-08 17:32:26

Answer 2

+1 A:

What you indicated as the "broken" encoded string is actually the correct encoding according to standards. The one that you indicated as "correct" encoding is using a non-standard extension to the specifications to allow a format of %uXXXX (I believe it's supposed to indicate UTF-16 encoding).
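For comparison, here is a rough sketch of the two encodings side by side. Uri.EscapeDataString produces the standard form (percent-encoded UTF-8 bytes). The %uXXXX form is built by hand here purely to show its shape; I'm not claiming this is how ASP.NET produces it. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class EncodingForms
{
    // Standard form: each non-ASCII character becomes its UTF-8 bytes, written as %XX.
    public static string StandardForm(string s) => Uri.EscapeDataString(s);

    // Non-standard form: each non-ASCII character becomes %u plus its UTF-16 code unit.
    // Illustration only: ASCII characters (including reserved ones) are left untouched.
    public static string PercentUForm(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (c < 128) sb.Append(c);
            else sb.Append("%u").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(StandardForm("heřmánek"));  // he%C5%99m%C3%A1nek
        Console.WriteLine(PercentUForm("heřmánek"));  // he%u0159m%u00E1nek
    }
}
```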

In any case, the "broken" encoded string is ok. You can use the following code to test that:

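// Requires: using System; using System.Web;
Uri uri = new Uri("http://www.example.com/test.aspx?search=heřmánek");
Console.WriteLine(uri.Query);                         // the query string as the Uri exposes it
Console.WriteLine(HttpUtility.UrlDecode(uri.Query));  // decoded with the default encoding, UTF-8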

Works fine. However... on a hunch, I tried UrlDecode with a Latin-1 codepage specified, instead of the default UTF-8:

// Requires: using System.Text;
Console.WriteLine(HttpUtility.UrlDecode(uri.Query,
    Encoding.GetEncoding("iso-8859-1")));

... and I got the bad value you specified, 'heÅmÃ¡nek'. In other words, it looks like the call to HttpContext.RewritePath() somehow changes the urlencoding/decoding to use the Latin-1 codepage, rather than UTF-8, which is the default encoding used by the UrlEncode/Decode methods.

This looks like a bug if you ask me. You can look at the RewritePath() code in Reflector and see that it is definitely playing with the query string: passing it around to all kinds of virtual path functions, and out to some unmanaged IIS code.

I wonder if somewhere along the way, the Uri at the core of the Request object gets manipulated with the wrong codepage? That would explain why Request.QueryString (which is simply the raw values from the HTTP headers) would be correct, while the Uri, using the wrong encoding for the diacritics, would be incorrect.

womp 2010-02-08 06:34:05
