Hello!

Which characters are allowed in GET parameters without encoding or escaping them? I mean something like this:

http://www.example.org/page.php?name=XYZ

What can you have there instead of XYZ? I think only the following characters:

  • a-z (A-Z)
  • 0-9
  • -
  • _

Is this the full list or are there additional characters allowed?

I hope you can help me. Thanks in advance!

A: 

"." | "!" | "~" | "*" | "'" | "(" | ")" are also acceptable [RFC2396]. Really, anything can be in a GET parameter if it is properly encoded.

geowa4
but those have special meaning, so if you want to *send* % or + you **have** to encode them.
voyager
Yeah, I don't know why I wrote %.
geowa4
Thank you! I only want to know which characters can be used WITHOUT encoding or escaping them. I should have pointed out this better. So can I really use *!'()| without encoding them?
A: 

Alphanumeric characters and all of

~    -    _    .    !    *    '   (    )    ,   

are valid within a URL.

All other characters must be encoded.

womp
Thanks, you've understood everything correctly. I want to know which characters I can use without encoding them. Are you sure that !*'(), are such characters?
+2  A: 

There are reserved characters, which have reserved meanings: the delimiters :/?#[]@ and the subdelimiters !$&'()*+,;=

There is also a set of characters called unreserved characters (alphanumerics and -._~), which are not to be encoded.

That means that anything outside the unreserved set is supposed to be %-encoded whenever it is not being used for its special meaning (e.g. when it is passed as part of a GET parameter value).

See also RFC3986: Uniform Resource Identifier (URI): Generic Syntax
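
To make the rule concrete, here is a minimal sketch in Python (not PHP, which the question uses). The helper name `encode_param` is made up for illustration, and the comparison with `urllib.parse.quote(..., safe="")` assumes Python 3.7 or later, where `~` is treated as unreserved.

```python
import string
from urllib.parse import quote

# Unreserved characters per RFC 3986, section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def encode_param(value: str) -> str:
    """Percent-encode every byte that is not an unreserved character.

    Hypothetical helper for illustration only.
    """
    out = []
    for byte in value.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)              # safe to send as-is
        else:
            out.append(f"%{byte:02X}")  # %-encode everything else
    return "".join(out)

print(encode_param("~my_start-page.en"))  # only unreserved characters, so unchanged
print(encode_param("a&b=c d"))            # &, = and the space get encoded
print(encode_param("a&b=c d") == quote("a&b=c d", safe=""))
```

In practice you would call the library function rather than roll your own; the loop is only there to show which characters fall through untouched.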

Michael Krelin - hacker
Thank you very much! So I have to add . and ~ to my list? Can I write index.php?page=start_en-new~. without escaping it?
It would be somewhat too bold a statement to say you can't, but you shouldn't. If you were to normalize the URI you'd *have* to escape unreserved characters (and only unreserved), but it is very likely that it will actually *work* unescaped.
Michael Krelin - hacker
Generally, you have the escape function that escapes everything that needs to be escaped. And you normally use this function to escape *all* parameters you pass.
Michael Krelin - hacker
So I shouldn't use ~ and . unescaped, either? So only alphanumeric? Is urlencode() in PHP the function you mean? I could pass all characters to urlencode() and see what goes out unescaped!?
OMG, I hadn't looked carefully at your example. I thought that was just a generic bunch of special characters ;-) No, you don't have to escape those, of course, as they are unreserved. Sorry for the confusion. As for `urlencode()`, I have no idea if it works correctly (that's not always the case with PHP functions), but if it does then yes, you can test with it ;-) Like I said: escape everything but unreserved.
Michael Krelin - hacker
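
The "pass every character through the escape function and see what comes out unchanged" experiment can be sketched like this. It uses Python's `urllib.parse.quote` as a stand-in, so it says nothing about what PHP's `urlencode()` actually does; it also assumes Python 3.7 or later.

```python
import string
from urllib.parse import quote

# Printable ASCII punctuation plus the space character.
candidates = string.punctuation + " "

# Characters that quote() leaves untouched when nothing is marked as "safe".
unchanged = [c for c in candidates if quote(c, safe="") == c]

print("".join(unchanged))
```

Letters and digits are left out of `candidates` on purpose, since nobody disputes those.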
:) Thanks. So I create a page with the name "~my_start-page.en" and pass the name via GET without any problems, correct? page.php?name=~my_start-page.en
Yes, that should be it. Those characters are safe as query parameters with no escaping, so whether you will have problems processing that name later I don't know, but you can pass it with no problems ;-)
Michael Krelin - hacker
You're right, ~ and . seem to work fine. But what about the other answers here? They mention other characters which can be used unencoded as well. Why didn't you mention them? Are the other answers wrong?
I did mention the RFC on URI syntax, didn't I? And the newest of all the RFCs mentioned, too! ;-) Actually, like I said, some other approaches to escaping may go unpunished, but they are still not standards-conformant. As long as URIs are to be normalized and compared for equality in normalized form, the punishment will follow the crime ;-)
Michael Krelin - hacker
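
For anyone wondering what "normalized and compared for equality" looks like, here is a rough Python sketch of just the percent-encoding part of normalization as I read RFC 3986, section 6.2.2: escapes of unreserved characters are decoded, and the hex digits of the remaining escapes are uppercased. The function name `normalize_escapes` is hypothetical, and a real normalizer does more (case of scheme and host, dot segments, and so on).

```python
import re
import string

# Unreserved characters per RFC 3986, section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def normalize_escapes(uri: str) -> str:
    """Decode %XX escapes of unreserved characters; uppercase all other escapes."""
    def fix(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, uri)

a = "page.php?name=%7Emy_start-page%2een"
b = "page.php?name=~my_start-page.en"

print(normalize_escapes(a) == normalize_escapes(b))  # both spell the same name
print(normalize_escapes("page.php?name=a%26b"))      # %26 (&) must stay encoded
```

The two spellings of the page name only compare equal after this step, which is why it matters which characters count as unreserved.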
So the RFCs mentioned in the other questions are about 8 years older and contain special chars which aren't allowed unencoded anymore?
Michael Krelin - hacker
A: 

From RFC 1738 on which characters are allowed in URLs:

Only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

The reserved characters are ";", "/", "?", ":", "@", "=" and "&", which means you need to URL-encode them if you want to use them as literal data rather than for their reserved purpose.
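
A small Python sketch of why that matters (the sample value is made up, and Python's `urlencode`/`parse_qs` merely stand in for whatever your server side uses): a value containing "&" or "=" survives the round trip only if it is encoded first.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical value containing reserved characters.
value = "Tom & Jerry=friends?"

# Encoded: the reserved characters are escaped, so the value round-trips.
query = urlencode({"name": value})
print(query)
print(parse_qs(query)["name"][0] == value)

# Unencoded: "&" and "=" are read as delimiters and the value is split up.
broken = "name=" + value
print(parse_qs(broken))
```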

ctford
Thanks! Are you sure that I can use $+!'()" without escaping them?