I'd like to use NSURL's parameterString method to parse out the parameters of a URL I've been passed. The documentation says the URL must conform to RFC 1808, but now I'm wondering whether ours do. We use something like:

http://server/path/query?property1=value1&property2=value2

but RFC 1808 never mentions the ampersand (&) as a valid parameter separator, at least the way I read it. It suggests the semicolon (;), perhaps because it was drafted in 1995. Has the & replaced the ;? If so, can anyone verify whether NSURL's parameterString will also parse with & as the delimiter?

What's the "right" way to do this, before we dig ourselves too big a hole?

A: 

RFC 1808 doesn't define the internal format of the query string. I believe the semicolon syntax that RFC 1808 describes is a different kind of additional information (parameters attached to the path), which in practice is almost never used. As far as I can see, the NSURL interface does not include any methods for parsing or splitting the contents of the query string itself, so the separator is of no concern to the class, and your URL is indeed 1808-compliant.
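To make the path-parameter versus query distinction concrete, here is a minimal sketch. It uses Python's urllib.parse rather than NSURL, purely as an illustration, because its urlparse function splits a URL into the same six components (scheme, network location, path, params, query, fragment). The ";type=a" parameter in the second URL is a hypothetical addition, not something from the question.

```python
from urllib.parse import urlparse

# The URL from the question: it has a query but no ";params" part.
plain = urlparse("http://server/path/query?property1=value1&property2=value2")

# Hypothetical variant with an RFC 1808 style parameter on the path.
with_params = urlparse(
    "http://server/path/query;type=a?property1=value1&property2=value2"
)

for parts in (plain, with_params):
    # params is the ";..." part of the path; query is everything after "?".
    print("path:  ", parts.path)
    print("params:", repr(parts.params))
    print("query: ", parts.query)
```

In both cases the query comes back as one opaque string; the "&" inside it plays no role in splitting the URL into components.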

Actually, query strings don't inherently have any RFC-defined format; you can perfectly well put any string in one and retrieve it untouched at the server side. However, the HTML standard describes a way of creating query strings from form contents, and this application/x-www-form-urlencoded format is what most server-side scripts expect.
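As a rough illustration of that form encoding, the sketch below uses Python's urllib.parse (an assumption for demonstration; any form-encoding library would do). The field names and the "note" value are made up to show how a literal "&" inside a value gets escaped.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical form fields; the second value contains a space and an "&".
fields = [("property1", "value1"), ("note", "a b&c")]

# Encode as application/x-www-form-urlencoded: pairs are joined with "&",
# spaces become "+", and a literal "&" in a value becomes "%26".
query = urlencode(fields)
print(query)

# Decoding gives back the original name/value pairs.
decoded = parse_qsl(query)
print(decoded)
```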

According to HTML4 section 17.13.4.1, & is the parameter separator browsers must use to create query strings from multiple parameters, so yes, you must support the ampersand as a parameter separator. HTML4 recommends that server-side scripts also accept the semicolon as an alternative separator in query strings, since that avoids having to escape & as &amp; when a URL is written into HTML. But it doesn't require it, and indeed (unfortunately) many server/form-reading environments do not accept the semicolon for this purpose.
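If you end up splitting the query string yourself, a tolerant parser along these lines would cover both separators. This is only a sketch in Python; the function name split_query and its accept_semicolon flag are hypothetical, not part of any library.

```python
from urllib.parse import unquote_plus


def split_query(query, accept_semicolon=True):
    """Split a form-encoded query string into (name, value) pairs.

    "&" is always treated as a separator; ";" is treated as one only
    when accept_semicolon is true.
    """
    if accept_semicolon:
        # Escaped semicolons ("%3B") are untouched here because
        # unescaping happens only after the split.
        query = query.replace(";", "&")
    pairs = []
    for chunk in query.split("&"):
        if not chunk:
            continue  # skip empty pieces such as a trailing "&"
        name, _, value = chunk.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


print(split_query("property1=value1&property2=value2"))
print(split_query("property1=value1;property2=value2"))
```

Whether to accept the semicolon is a design choice: it is harmless if your own values always escape ";" as "%3B", but it will mis-split values that contain a raw semicolon.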

bobince
+1  A: 

According to RFC 1808 (section 2.1, URL Syntactic Components), the correct syntax is as follows:

<scheme>://<net_loc>/<path>;<params>?<query>#<fragment>

It says query information is formatted as per section 3.3 of RFC 1738, which tells us:

"Within the path and searchpart components, "/", ";", "?" are reserved."

To me, the above says that in your URL the path (to your CGI) is:

http://server/path/query

and the query is:

property1=value1&property2=value2 

which does not contain any reserved characters, so you are OK. In fact, the use of "&" as a separator in the query string here derives from the CGI specification and not from the URL RFCs:

"Form data is a stream of name=value pairs separated by the & character."

Tyr
Very helpful, thanks! It's hard to search for this stuff on the Web because every friggin' URL "hits" in one way or another.
Meltemi