I'm using Django with Google's App Engine.

I want to send information to the server with percent-encoded slashes: a request like http://localhost/turtle/waxy%2Fsmooth that would match against a URL pattern like r'^/turtle/(?P<type>([A-Za-z]|%2F)+)$'. The request reaches the server intact, but at some point before it is compared against the regex, the %2F is converted into a forward slash.

What can I do to stop the %2Fs from being converted into forward slashes? Thanks!

A: 

os.environ['PATH_INFO'] is already decoded, so that information is lost by the time you see it. os.environ['REQUEST_URI'] is probably available, and when it is, it is not decoded. Django only reads PATH_INFO, so you could probably rewrite PATH_INFO from REQUEST_URI before Django sees it, with something like:

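import re
import urllib
request_uri = environ['REQUEST_URI']
# flags must be passed by keyword; the fourth positional argument of re.sub is count
request_uri = re.sub(r'%2f', '****', request_uri, flags=re.I)
environ['PATH_INFO'] = urllib.unquote(request_uri)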

Then all cases of %2f are replaced with **** (or whatever you want to use).
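
As a rough illustration of where such a rewrite could live, here is a minimal sketch written as a WSGI middleware in Python 3 (so it uses urllib.parse.unquote rather than urllib.unquote). The names preserve_encoded_slashes and PLACEHOLDER are made up for this example, REQUEST_URI is assumed to be set by the server (it is not a required WSGI key), and SCRIPT_NAME and non-ASCII paths are ignored for brevity.

```python
import re
from urllib.parse import unquote

# Hypothetical marker; choose something that cannot occur in real paths.
PLACEHOLDER = "****"


def preserve_encoded_slashes(app):
    """Wrap a WSGI app so %2F in the raw URI shows up as PLACEHOLDER in PATH_INFO."""
    def middleware(environ, start_response):
        # REQUEST_URI is non-standard, so fall through untouched if it is missing.
        raw = environ.get("REQUEST_URI")
        if raw is not None:
            path = raw.split("?", 1)[0]  # drop the query string
            path = re.sub(r"%2f", PLACEHOLDER, path, flags=re.I)
            environ["PATH_INFO"] = unquote(path)
        return app(environ, start_response)
    return middleware
```

The URL pattern would then match on the placeholder instead of %2F, and the view would turn it back into a slash.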

Ian Bicking
What sets REQUEST_URI? I don't see it in pep 333's required wsgi variable list.
Forest
It's not required by PEP 333, but it is widely set as part of CGI (or CGI-like) requests. On GAE, if it exists, then it is certain to keep existing. It is the complete request path, with no URL decoding applied to it.
Ian Bicking
`os.environ['REQUEST_URI']` is not available on GAE.
David Underhill
Then the information is lost and there's no way to distinguish %2f from /
Ian Bicking
Another technique that works for GData is that you have things like `/{http:%2f%2ffoo}attr/` (which by the time it gets to WSGI looks like `/{http://foo}attr/`), but because it uses nested braces you can parse out the chunks. You could design your URLs similarly.
Ian Bicking
I think that the suggestion of bracketing the names is probably the best solution short of hacking Django. Thanks for your help, Ian!
J. Frankenstein
Interesting approach. Beware, though: curly braces are not valid characters in the latest generic URI syntax rules ( http://tools.ietf.org/html/rfc3986#section-2.2 ) and they were specifically disallowed in the previous rules ( http://tools.ietf.org/html/rfc2396#section-2.4.3 ).
Forest
Quite right, they should be escaped like `/%7bhttp:%2f%2ffoo%7dattr/`
Ian Bicking
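
For anyone trying the bracketing idea from the comments above, here is a minimal sketch of how the already-decoded path could be split so that slashes inside braces are kept. The function name split_path is hypothetical, and the sketch assumes braces are balanced and arrive decoded in PATH_INFO (they would be sent as %7b and %7d on the wire).

```python
def split_path(path):
    """Split a decoded path on '/', ignoring slashes inside {...}."""
    segments, current, depth = [], [], 0
    for ch in path:
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
        if ch == "/" and depth == 0:
            # A top-level slash ends the current segment.
            segments.append("".join(current))
            current = []
        else:
            current.append(ch)
    segments.append("".join(current))
    # Drop the empty segments produced by leading and trailing slashes.
    return [s for s in segments if s]
```

With this, the example path `/{http://foo}attr/` comes back as a single segment, `{http://foo}attr`, rather than being cut at the slashes inside the braces.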