I am retrieving an encoded URL via the query string and need to pass it on to the next page. When I first retrieve it using $_REQUEST['url'], it comes back only partially decoded (the slashes are decoded, but the %2C sequences are not), e.g.:

http://example.com/search~S10?/Xllamas&searchscope=10&SORT=D/Xllamas&searchscope=10&SORT=D&SUBKEY=llamas/51%2C64%2C64%2CB/browse

The PHP docs page for urldecode advises against decoding request data, saying that it will already have been decoded. I need the value either completely decoded, so that I can encode it again without double-encoding some parts, or not decoded at all.

I'm not sure why my experience of this data is incongruous with the php docs. Appreciate any help or pointers to same!!

EDIT: attempt to post relevant code, which is scattered about:

The URL is encoded and added to the query string (in an HTML file using a Smarty template):

    <a class="button" href="{$baseurl}search_nojs?searcharg={$searcharg|escape:'url'}&url={$next|escape:'url'}"><span>Next&gt;&gt;</span></a>

If that link was followed, I grab the URL back out of the query string (in a PHP file):

    if (array_key_exists('url', $_REQUEST)) {
        $sm->assign("searchurl", $_REQUEST['url']);
    }

Then I'd like to put the URL back into the query string for the next link (in another HTML file):

    href="{$baseurl}detail?bibid={$res.bibid}&searcharg={$searcharg}{if $searchurl}&searchurl={$searchurl}{/if}"

I'm also printing {$searchurl} straight onto the page, and getting the same half-escaped result.

Here is another example of the query string vs. the data I get from $_REQUEST:

originally encoded url in querystring:
searcharg=mammals&url=http%3A%2F%2Fexample.com%2Fsearch%7ES10%3F%2FXmammals%26searchscope%3D10%26SORT%3DD%2FXmammals%26searchscope%3D10%26SORT%3DD%26SUBKEY%3Dmammals%2F51%252C1114%252C1114%252CB%2Fbrowse

data retrieved from $_REQUEST:
searcharg=mammals&searchurl=http://example.com/search~S10?/Xmammals&amp;searchscope=10&amp;SORT=D/Xmammals&amp;searchscope=10&amp;SORT=D&amp;SUBKEY=mammals/51%2C1114%2C1114%2CB/browse

I know this method may seem curious -- I am trying to make a mobile display, working around a black-box database. Thanks again for any help!!

A: 

The comma (U+002C) is a reserved character in the query and thus must be encoded with %2C:

3.4. Query Component

The query component is a string of information to be interpreted by the resource.

  query         = *uric

Within a query component, the characters ";", "/", "?", ":", "@", "&", "=", "+", ",", and "$" are reserved.
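
As a small illustration of encoding a reserved comma, the sketch below uses PHP's rawurlencode on a fragment taken from the question's URL. The variable names are made up for the example.

```php
<?php
// Fragment of the question's URL containing reserved characters ("/" and ",").
$subkey = 'mammals/51,1114,1114,B/browse';

// rawurlencode percent-encodes everything except letters, digits and "-_.~".
$encoded = rawurlencode($subkey);
echo $encoded, "\n"; // mammals%2F51%2C1114%2C1114%2CB%2Fbrowse

// Decoding once gives the original string back.
$decoded = rawurldecode($encoded);
```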

Gumbo
I can't see a comma in his URL, but he has a forward slash in the name of the first variable and in some values, which could cause problems.
arnorhs
Gumbo: does this mean I could use urldecode? (p.s. additional info added to question)
hackmaster.a
A: 

Here is another example of the query string vs. the data I get from $_REQUEST:

originally encoded url in querystring: searcharg=mammals&url=http%3A%2F%2Fexample.com%2Fsearch%7ES10%3F%2FXmammals%26searchscope%3D10%26SORT%3DD%2FXmammals%26searchscope%3D10%26SORT%3DD%26SUBKEY%3Dmammals%2F51%252C1114%252C1114%252CB%2Fbrowse

This is double-encoded. For example, %252C decodes to %2C, which in turn decodes to a comma. So at the point where you encode the url parameter, you are introducing double encoding: the value you start with already contains %2C, and encoding it again turns the % into %25. Before encoding a parameter, make sure it is fully decoded, i.e. decode it until it can be decoded no further (canonicalisation). You could call urldecode in a loop for this.
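
A minimal sketch of the loop idea, assuming the value contains no literal percent sequences or plus signs that are meant to survive (repeated decoding would corrupt those). The helper name `fully_urldecode`, the round limit, and the shortened sample URL are hypothetical, not taken from the question's code.

```php
<?php
// Hypothetical helper: decode repeatedly until the string stops changing.
// Assumption: no literal "%XX" or "+" in the real value needs to be preserved.
function fully_urldecode(string $value, int $maxRounds = 5): string
{
    for ($i = 0; $i < $maxRounds; $i++) {
        $decoded = urldecode($value);
        if ($decoded === $value) {
            break; // nothing left to decode
        }
        $value = $decoded;
    }
    return $value;
}

// Half-decoded value, similar to what $_REQUEST['url'] returned (shortened).
$fromRequest = 'http://example.com/search~S10?/Xmammals&SUBKEY=mammals/51%2C1114%2C1114%2CB/browse';
$plain = fully_urldecode($fromRequest);

// Encode exactly once when building the next link.
$next = 'detail?searchurl=' . rawurlencode($plain);
```

The round limit is only a guard against pathological input; for the URLs in the question one or two rounds should be enough.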

You also want to ensure that when you put the url parameter back into an HTML context (as a link), you escape it for HTML attributes too. Otherwise you have an XSS vulnerability.
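
A rough sketch of the two escaping layers in plain PHP: URL-encode the parameter value first, then HTML-escape the finished attribute value. The bibid value and the hostile sample input are invented for illustration.

```php
<?php
// Invented sample input, including an attempt to break out of the attribute.
$searchUrl = 'http://example.com/search~S10?/Xmammals&searchscope=10"><script>alert(1)</script>';

// Layer 1: URL-encode the value so it stays a single query parameter.
$href = 'detail?bibid=42&searchurl=' . rawurlencode($searchUrl);

// Layer 2: HTML-escape the whole attribute value before printing it.
$link = '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">Next</a>';
echo $link, "\n";
```

In the Smarty templates from the question, the same layering would presumably be done with the escape modifier, for example `{$searchurl|escape:'url'}` for the parameter value, with HTML escaping applied to the attribute as a whole.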

jah
This seems very helpful I will work on it from this angle.
hackmaster.a
I found the spot where it needed to be decoded on first retrieval. Thanks so much, jah! I will keep in mind your other tip, also.
hackmaster.a