Does anyone know the full list of characters that can be used in the URL of a GET request without being encoded? At the moment I am using A-Z, a-z and 0-9, but I would like to find out the full list.

I am also interested in whether a specification has been released for the upcoming addition of Chinese and Arabic URLs, as that would obviously have a big impact on my question.

+3  A: 

Here's a handy table:

http://en.wikipedia.org/wiki/Percent-encoding#Types%5Fof%5FURI%5Fcharacters

Amber
+3  A: 

From here

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

AdaTheDev
+5  A: 

From RFC 1738:

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.

Myles
+2  A: 

These are listed in RFC 3986. See the Collected ABNF for URI for what is allowed where, and the regular expression for parsing/validation.
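
As a rough illustration of what RFC 3986 allows everywhere without encoding, here is a minimal Python sketch of its "unreserved" set (section 2.3: letters, digits, and `-`, `.`, `_`, `~`). The names `UNRESERVED` and `needs_encoding` are made up for this example, and the check is deliberately conservative: reserved characters such as `/` or `?` are legal in particular URI components, which this sketch does not model.

```python
import string

# RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
# UNRESERVED and needs_encoding are hypothetical names for this sketch.
UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")


def needs_encoding(value: str) -> bool:
    """Return True if any character falls outside the unreserved set."""
    return any(ch not in UNRESERVED for ch in value)


if __name__ == "__main__":
    for sample in ("abc-123_x.y~z", "a b", "name=value", "中文"):
        print(repr(sample), needs_encoding(sample))
```

Anything for which `needs_encoding` returns True is a candidate for percent-encoding when it is used as data, for example as a query parameter value.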

McDowell
A: 

The upcoming change is for Chinese and Arabic domain names, not URIs. Internationalised URIs are called IRIs and are defined in RFC 3987. Having said that, I'd recommend not doing this yourself and relying on an existing, tested library instead. There are many variations in URI encoding/decoding, and what is considered safe by the specification differs from what is safe in actual use (browsers).
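
For example, assuming Python is available, the standard library's `urllib.parse` already handles percent-encoding, so a minimal sketch of "let the library do it" could look like this. The helper name `encode_query_value` is invented for illustration; `quote`, `unquote` and `urlencode` are the real standard-library functions.

```python
from urllib.parse import quote, unquote, urlencode


def encode_query_value(value: str) -> str:
    """Percent-encode a single value for use inside a query string.

    safe="" means "/" is escaped too (quote() leaves it alone by default).
    Non-ASCII text is encoded as UTF-8 bytes first, then percent-escaped.
    """
    return quote(value, safe="")


if __name__ == "__main__":
    print(encode_query_value("a b&c=d/e"))       # a%20b%26c%3Dd%2Fe
    print(encode_query_value("中文"))             # %E4%B8%AD%E6%96%87
    print(unquote("%E4%B8%AD%E6%96%87"))          # 中文
    # urlencode builds a whole query string; it encodes spaces as "+".
    print(urlencode({"q": "a b", "lang": "en"}))  # q=a+b&lang=en
```

Note that the two helpers make different choices for spaces (`%20` versus `+`), which is exactly the kind of detail that is easy to get wrong by hand.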

dajobe