According to the W3C (and they are the official source on these things), a space character in the query string (and in the query string only) may be encoded as either "%20" or "+". From the section "Query strings" under "Recommendations":
Within the query string, the plus sign is reserved as shorthand notation for a space. Therefore, real plus signs must be encoded. This method was used to make query URIs easier to pass in systems which did not allow spaces.
According to section 3.4 of RFC 2396, which is the official specification on URIs in general, the "query" component is URL-dependent:
3.4. Query Component
The query component is a string of information to be interpreted by
the resource.
query = *uric
Within a query component, the characters ";", "/", "?", ":", "@",
"&", "=", "+", ",", and "$" are reserved.
It is therefore a bug in the other software if it does not accept URLs with spaces in the query string encoded as "+" characters.
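To illustrate what accepting both forms looks like, here is a minimal sketch using Java's own java.net.URLDecoder, which treats "+" and "%20" in form-encoded data as the same space character. The class name QueryDecodingDemo and the helper decode are made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of any library.
final class QueryDecodingDemo {

    // Decodes a single query-string value as UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java runtime.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("a+b"));    // space written as "+"
        System.out.println(decode("a%20b"));  // space written as "%20"
        System.out.println(decode("1%2B1"));  // a real plus sign must be "%2B"
    }
}
```

Both of the first two calls yield the same string with a space, while a literal plus sign only survives if it was sent as "%2B".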
As for the third part of your question, one way (though slightly ugly) to fix the output from URLEncoder.encode() is to then call replaceAll("\\+","%20") on the return value.
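A minimal sketch of that workaround, assuming UTF-8 is the encoding you want; the class name QueryEncoding and the helper encodeWithPercent20 are made up for this example. The replacement is safe because URLEncoder already turns real plus signs into "%2B", so any "+" left in its output can only stand for a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, not part of any library.
final class QueryEncoding {

    // Encodes a value with URLEncoder, then rewrites "+" (space) as "%20".
    static String encodeWithPercent20(String value) {
        try {
            // "\\+" is a regex matching a literal plus sign.
            return URLEncoder.encode(value, "UTF-8").replaceAll("\\+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java runtime.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeWithPercent20("a b+c")); // prints a%20b%2Bc
    }
}
```

Since the pattern is a fixed string, replace("+", "%20") would do the same job without going through the regex engine.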