I am aware that a + in the query string of a URL represents a space. Is this also the case outside of the query string region? That is to say, does the following URL:

http://a.com/a+b/c

actually represent:

http://a.com/a b/c

(and thus need to be encoded if it should actually be a +), or does it in fact represent a+b/c?

Thanks!

A: 

Thou shalt always encode URLs.

Here is how Ruby encodes your URL:

irb(main):008:0> CGI.escape "a.com/a+b"
=> "a.com%2Fa%2Bb"
Lennart
Sorry, allow me to clarify slightly. If the user types in "http://a.com/a+b/", then this is to be interpreted to mean a%20b and not a%2Bb?
Francisco Ryan Tolmasky I
I am not sure that's right. According to RFC2396 (http://www.ietf.org/rfc/rfc2396.txt) plusses are not reserved characters in the path (segments) of the URI, only the query component. That seems to imply that they don't need to be URL encoded and thus shouldn't be interpreted as spaces in the path, only in the query.
tlrobinson
Ah okay. It would be a%2Bb!
Lennart
RFC 1738, however, does treat pluses as spaces. It all depends on which one is implemented by your encode/decode functions. For example, in PHP, rawurlencode follows RFC 1738 whereas urlencode follows RFC 2396.
Jonathan Fingland
See, now I have some additional confusion. In the example you gave me above, a.com%2Fa%2Bb is not what I want; it would at the very least be a.com/a%2Bb. This is an actual URL I'm dealing with, not a URL being passed as a parameter in a query string. For a little background that may help to clarify: the Mac OS X Finder is returning file system URLs to me. So if I have a file named "a?+b.txt", it returns something that looks like "file://a%3F+b.txt", NOT "file://a%3F%2Bb.txt". Is the Finder just incorrect, or is a + before the query string actually a plus?
Francisco Ryan Tolmasky I
Jonathan: Are you sure 1738 says + is reserved? I see:
safe = "$" | "-" | "_" | "." | "+"
unreserved = alpha | digit | safe | extra
as well as:
"Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL."
tlrobinson
A: 

You can find a nice list of corresponding URL encoded characters on W3Schools.

  • + becomes %2B
  • space becomes %20
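
To see those two mappings in action, here is a minimal Ruby sketch. The helper encode_path_segment is hypothetical (it is not a standard library method); it assumes you want to percent-encode every byte that is not an RFC 3986 unreserved character. That is stricter than a path requires, since a literal + is also permitted there.

```ruby
# Hypothetical helper, written only for illustration: percent-encode every
# byte except the RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~").
def encode_path_segment(segment)
  segment.gsub(/[^A-Za-z0-9\-._~]/) do |ch|
    # Encode each byte of the character as %XX (uppercase hex).
    ch.bytes.map { |b| format("%%%02X", b) }.join
  end
end

puts encode_path_segment("a+b")  # plus is written as %2B
puts encode_path_segment("a b")  # space is written as %20
```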
Niels R.
+7  A: 
  • Percent-encoded characters in the path section of a URL are expected to be decoded, but
  • any + characters in the path component are expected to be treated literally.

To be explicit: + is only a special character in the query component.
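
A rough Ruby sketch of the difference, using the file name from the question. decode_path_segment is a hypothetical helper written for this illustration (it only handles ASCII escapes); URI.decode_www_form_component is the standard library's form/query decoder.

```ruby
require 'uri'

# Hypothetical helper, not a library call: decode %XX escapes only and
# leave "+" untouched, which is how a path segment is meant to be read.
# It only handles ASCII escapes, which is enough for this illustration.
def decode_path_segment(segment)
  segment.gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }
end

encoded = "a%3F+b.txt"

# Path rule: "+" stays a plus.
puts decode_path_segment(encoded)

# Query (form-encoding) rule: "+" is turned into a space.
puts URI.decode_www_form_component(encoded)
```

So the Finder's "file://a%3F+b.txt" is consistent with the path rule: only the ? had to be escaped.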

Stobor
+1 Unfortunately, many "URL coders/encoders" out there in the wild do not understand this. E.g.:
http://www.sislands.com/coin70/week6/encoder.htm
http://www.keyone.co.uk/tools-url-encoder.asp
http://meyerweb.com/eric/tools/dencoder/
leonbloy