Say the path of your URL is:

/thisisa"quote/helloworld/

Then how do you create the rel=canonical URL?

Is this kosher?

<link rel="canonical" href="/thisisa&quot;/helloworld/" />

UPDATE

To clarify, I'm getting a form submission, I need to convert part of the query string into the URL. So the steps are:

  1. .htaccess does the redirect
  2. PHP processes a directory as a query string.
  3. The query string will be dynamically inserted into the:
    • Title,
    • Description,
    • Keywords
    • Canonical URL.
    • Spit back into the form's input box

So I need to know which processing has to be done at each step of the way. As a first cut, this is my take:

  • Title: htmlspecialchars($rawQuery)
  • Description: htmlspecialchars($rawQuery)
  • Keywords: htmlspecialchars($rawQuery)
  • Canonical URL: This is the tricky part. It must match the URL that .htaccess redirects to, but even so, I think the raw query is unsafe because quotes can cause JavaScript injection. I'm worried about urlencode($rawQuery): since the value comes from the URL, wouldn't it already be URL-encoded?
  • Spit back into form: htmlspecialchars($rawQuery)
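
The list above could be sketched roughly as follows. This is only a sketch under assumptions: $rawQuery is taken to hold the already-decoded value from the rewritten path, the form field name "q" is made up for illustration, and the "/helloworld/" suffix is borrowed from the example path. The idea is that HTML contexts only need HTML escaping, while the canonical URL needs percent-encoding first and HTML escaping second.

```php
<?php
// Hypothetical input: the decoded value taken from the rewritten path.
$rawQuery = 'thisisa"quote';

// HTML contexts (title, meta content, input value): escape for HTML only.
$htmlSafe = htmlspecialchars($rawQuery, ENT_QUOTES, 'UTF-8');

// URL context: percent-encode the path segment first, then escape the
// finished URL for the HTML attribute it is placed in.
$canonicalPath = '/' . rawurlencode($rawQuery) . '/helloworld/';
$canonicalAttr = htmlspecialchars($canonicalPath, ENT_QUOTES, 'UTF-8');

echo '<title>' . $htmlSafe . "</title>\n";
echo '<meta name="description" content="' . $htmlSafe . "\" />\n";
echo '<link rel="canonical" href="' . $canonicalAttr . "\" />\n";
// "q" is an assumed field name, not taken from the question.
echo '<input type="text" name="q" value="' . $htmlSafe . "\" />\n";
```

If the value were still percent-encoded at this point, rawurlencode() would encode the "%" signs a second time, so it matters which form $rawQuery is actually in.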
+4  A: 

Use URL escaping, in this case %22

http://everything2.com/title/URL+escape+sequences

lod3n
A: 

I would say you want to use the hex value for a quote, which is %22.

Read this to learn more about URL Encoding.

Joe Philllips
+1  A: 

A quote is not even a valid URL character, so I think long-term you should address this. It is specifically excluded from the URI syntax by RFC 2396.

To solve the immediate problem though, you'll need to escape the character, using %22.
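
As a rough illustration of that escaping in PHP, here is a minimal sketch. encodePath() is a hypothetical helper, not something from the question, and it assumes the path it receives is still in decoded form. It percent-encodes each segment with rawurlencode() and leaves the "/" separators alone.

```php
<?php
// Hypothetical helper: percent-encode each segment of a decoded path,
// leaving the "/" separators intact.
function encodePath(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

echo encodePath('/thisisa"quote/helloworld/'), "\n";
// prints /thisisa%22quote/helloworld/
```

Running it on a path that is already percent-encoded would turn %22 into %2522, so it should only be applied once, to the decoded value.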

womp
In reality, I'm not choosing the URL. I'm having to partially convert a query string into a URL, and need to make sure that query string is safe wherever it's used. I'm going to update the question with some follow-ups.
joedevon
A: 

If the URL contains a double quote, then enclose the attribute value in single quotes.

<link rel="canonical" href='foo.com/thisisa"/helloworld/' />

Do not use HTML encoding in URI strings. That is invalid syntax, because the ampersand is a reserved character in URIs and would itself have to be encoded. Instead, always use percent-encoding for URIs.

+2  A: 
Gumbo