Hello,
My Web host has refused to help me with this, so I'm coming to the wise folks here for some help "black-box debugging". Here's an edited version of what I sent to them:
I have two domains (among others) at DreamHost:
1) thefigtrees.net 2) shouldivoteformccain.com
I noticed today that when I host a CGI script on #1, the HTTP GET query string passed to it as the QUERY_STRING environment variable has already been URL-decoded by the time the script runs. This matters because a standard CGI library (such as Perl's CGI.pm) will then split the string on ampersands and decode it again itself. This causes two potential problems:
1) the string is doubly-decoded, so if a value is submitted to the script such as "%2525", it will end up being treated as just "%" (decoded twice) rather than "%25" (decoded once)
2) (more common) if there is an ampersand in a value submitted, then it will get (properly) submitted as %26, but the QUERY_STRING env. variable will have it already decoded into an "&" and then the CGI library will improperly split the query string at that ampersand. This is a big problem!
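To make the two failure modes concrete, here is a minimal sketch in Python rather than Perl. It assumes that urllib.parse.parse_qs behaves like CGI.pm in the relevant respect (split on ampersands, then decode each piece). The function name simulate and its predecoded flag are made up for illustration; they only mimic what I believe the server is doing, not any real server setting.

```python
from urllib.parse import parse_qs, unquote

def simulate(raw_query, predecoded):
    """Parse a query string the way a CGI library would.

    predecoded=True mimics (hypothetically) a server that URL-decodes
    QUERY_STRING before the script ever sees it.
    """
    query_string = unquote(raw_query) if predecoded else raw_query
    # Split on '&' and decode each name/value, keeping empty values visible.
    return parse_qs(query_string, keep_blank_values=True)

# Problem 2: an encoded ampersand inside a value.
print(simulate("x=y%26z", predecoded=False))  # {'x': ['y&z']}
print(simulate("x=y%26z", predecoded=True))   # {'x': ['y'], 'z': ['']}

# Problem 1: a value that is decoded twice.
print(simulate("x=%2525", predecoded=False))  # {'x': ['%25']}
print(simulate("x=%2525", predecoded=True))   # {'x': ['%']}
```

In the pre-decoded case the value "y&z" is split into two parameters, and "%2525" collapses all the way to "%".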
The script at http://thefigtrees.net/test.cgi demonstrates this. It echoes back the environment variables it is called with. Navigating in a browser to:
http://thefigtrees.net/lee/test.cgi?x=y%26z
You can see that REQUEST_URI properly contains x=y%26z (still encoded) but that QUERY_STRING already has it decoded to x=y&z. If I repeat the test at domain #2 ( http://www.shouldivoteformccain.com/test.cgi?x=y%26z ) I see that the QUERY_STRING remains undecoded, so that CGI.pm then splits and decodes correctly.
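The comparison above can also be automated. The following is only a sketch, again in Python: it assumes the server sets REQUEST_URI (which Apache does, although it is not part of the CGI specification), and the function name query_was_predecoded is hypothetical.

```python
from urllib.parse import unquote

def query_was_predecoded(environ):
    """Return True if QUERY_STRING looks like a decoded copy of the
    query part of REQUEST_URI (both taken from a CGI-style environment)."""
    request_uri = environ.get("REQUEST_URI", "")
    query_string = environ.get("QUERY_STRING", "")
    # Everything after the first '?' is the raw, still-encoded query.
    _, sep, raw_query = request_uri.partition("?")
    if not sep:
        return False  # no query in the URI, nothing to compare
    return query_string != raw_query and query_string == unquote(raw_query)

# Values as observed on domain #1 and domain #2 respectively.
domain1 = {"REQUEST_URI": "/lee/test.cgi?x=y%26z", "QUERY_STRING": "x=y&z"}
domain2 = {"REQUEST_URI": "/test.cgi?x=y%26z", "QUERY_STRING": "x=y%26z"}
print(query_was_predecoded(domain1))  # True
print(query_was_predecoded(domain2))  # False
```

In a real CGI script one would pass os.environ instead of the hand-written dictionaries.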
I tried disabling my .htaccess files on both to make sure that was not the problem, and saw no difference.
Could anyone speculate on potential causes of this, since my Web host seems unwilling to help me?
thanks, Lee