I'm looking for the best URL schema for a web app that comes in multiple versions: several languages, plus a simplified version for mobile phones. The two aspects can be combined, so there's an English regular and mobile version, a German regular and mobile version, etc.

Goals (in order of importance):

  • User-friendliness
  • Search engine friendliness
  • Ease of development

Aspects to consider:

  • What should the URLs look like?
  • How should the user navigate between versions?
  • How much logic should there be to automatically decide on a version?

I'll describe my concept so far below; maybe some of you have better ideas.

A: 

My current concept:

  • When a new user arrives, the app decides, based on cookies (see below), the Accept-Language: header and the user agent string (used to identify mobile browsers) which version to show, but does not reflect this in the URL (no redirects)
  • It defaults to the non-simplified English version
  • There are prominently displayed icons (flags, a stylized mobile phone) to choose other versions
  • When the user explicitly chooses a different version, this is reflected both in a changed URL and a browser cookie
  • The URL schema is / for the "automatic" version, /en/, /de/, etc. for the language version, /mobile/ for the simplified version, /normal/ for the non-simplified one, and combinations thereof, e.g. /mobile/en/ and /normal/de/
  • mod_rewrite is used to strip these URL prefixes and convert them to GET parameters for the app to parse
  • robots.txt disallows /mobile/ and /normal/
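
The decision logic in the list above can be sketched in Python. This is only an illustration under assumptions the post does not state: the precedence (explicit URL prefix, then cookie, then request headers, then the default), the cookie names layout and lang, the list of supported languages and the user-agent substrings are all hypothetical. The prefix splitting stands in for what the mod_rewrite rules would do.

```python
# Hypothetical configuration; none of these values come from the post.
LANGUAGES = ("en", "de")
VALID = {"layout": ("mobile", "normal"), "lang": LANGUAGES}
DEFAULT = {"layout": "normal", "lang": "en"}  # non-simplified English
MOBILE_HINTS = ("mobile", "android", "iphone", "symbian", "opera mini")


def split_prefix(path):
    """Strip a leading /mobile/ or /normal/ and a leading language code.

    Returns (explicit_choices, remaining_path). A trailing slash on the
    remaining path is dropped.
    """
    parts = [p for p in path.split("/") if p]
    explicit = {}
    if parts and parts[0] in VALID["layout"]:
        explicit["layout"] = parts.pop(0)
    if parts and parts[0] in LANGUAGES:
        explicit["lang"] = parts.pop(0)
    return explicit, "/" + "/".join(parts)


def lang_from_header(accept_language):
    """Return the supported language with the highest q-value, or None."""
    best, best_q = None, 0.0
    for item in accept_language.split(","):
        piece = item.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                continue  # skip malformed entries
        primary = tag.strip().lower().split("-")[0]
        if primary in LANGUAGES and q > best_q:
            best, best_q = primary, q
    return best


def choose_version(path, cookies=None, accept_language="", user_agent=""):
    """Return (version, remaining_path) for one request."""
    explicit, rest = split_prefix(path)
    version = dict(DEFAULT)
    # Lowest priority after the default: guesses from request headers.
    header_lang = lang_from_header(accept_language)
    if header_lang:
        version["lang"] = header_lang
    if any(hint in user_agent.lower() for hint in MOBILE_HINTS):
        version["layout"] = "mobile"
    # Next: an earlier explicit choice remembered in a cookie.
    for key, allowed in VALID.items():
        if (cookies or {}).get(key) in allowed:
            version[key] = cookies[key]
    # Highest priority: whatever the URL itself says.
    version.update(explicit)
    return version, rest
```

With these assumptions, choose_version("/normal/", user_agent="Mozilla/5.0 (iPhone)") still yields the normal layout, which is the reason the /normal/ prefix exists.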

Advantages:

  • The different language versions are all indexed separately by search engines
  • Cookies help, but are not necessary
  • There's a good chance that people will see the version that's ideal for them without having to make any choice
  • The user can always explicitly choose which version he wants (this makes the /normal/ URL necessary)
  • Each version has a URL which will display exactly that version when passed to others
  • /mobile/ and /normal/ are ignored by search engines; they would only be duplicate content.
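
The effect of those robots.txt rules can be checked with Python's standard urllib.robotparser. The robots.txt text and the crawler name ExampleBot below are assumptions made for this sketch; the post only says that /mobile/ and /normal/ are disallowed.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content matching the description in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mobile/
Disallow: /normal/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(path, agent="ExampleBot"):
    """True if a well-behaved crawler may fetch the given path."""
    return parser.can_fetch(agent, path)
```

Here crawlable("/en/") and crawlable("/") are allowed while crawlable("/mobile/en/") and crawlable("/normal/de/") are not, so only the language-prefixed and automatic URLs remain open to indexing.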

Disadvantages:

  • Requires heavy use of mod_rewrite, which I find rather cryptic
  • Users could send their current URL to someone and that person, when visiting it, could end up seeing a different version, which could cause confusion
  • There is still duplicate content between / and /en/, and I can't disallow / in robots.txt. Should I trust the search engines not to penalize me for exact duplicate content on the same domain? Or should I disallow /en/ and accept that people coming to / via a search engine may see a different version than the one they found in the search results?
Michael Borgwardt
+1  A: 

I am unclear why you would want to incorporate any kind of what you call versioning information, such as Accept-Language- or User-Agent-specific designations, into the URL scheme. The URL scheme should be indicative of the content only. The server should inspect the request headers to determine how to retrieve and/or format the response.

Glenn
With the user-agent specific version, I agree, since that differs only in formatting. But the different language versions are, at least from the POV of a search engine (arguably also from a human POV), different content - and you definitely want searches in all languages to find your pages.
Michael Borgwardt
You'll be able to tell from the user-agent when a search engine is fetching the resource. So, why not include the content from all of the languages in that response?
Glenn
Then someone who looked for a German search term gets an English page when he does not have his browser configured to send an Accept-Language: header. Also, changing URLs is the only way to let users choose their language (or mobile/non-mobile) when they have deactivated cookies.
Michael Borgwardt
+2  A: 

I suggest subdomains, personally.

I wouldn't include the mobile version in the URL at all: use the user agent to determine this, and possibly a cookie in case the user wants to view the full site on their mobile (think of how Flickr and Google do it). But for languages, yes: primary language at http://mydomain.com/, secondary languages at e.g. http://de.mydomain.com/ or http://fr.mydomain.com/

elliottcable
And if you don't ever display content at http://mydomain.com/ but instead redirect to the default (or user-setting-dependent) language subdomain then you don't even have duplicate content.
Joachim Sauer
I'm more concerned about making sure it's possible to see the mobile site with unrecognized exotic mobile user agents - and I don't want to depend on cookies for that.
Michael Borgwardt
Well, I guess you're stuck with something in the URL. Perhaps a second subdomain? http://mobile.fr.mydomain.com/ as opposed to http://fr.mydomain.com/? (With http://mobile.mydomain.com/ redirecting to http://mobile.en.mydomain.com/ as http://mydomain.com/ redirects to http://en.mydomain.com/)
elliottcable
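
The subdomain variant from the comments above (bare hosts redirect to a language subdomain, with an optional leading mobile label) could be sketched in Python as follows. The function names, the supported-language list and the default language are assumptions for illustration; only the host names come from the thread.

```python
BASE = "mydomain.com"
LANGUAGES = ("en", "de", "fr")   # assumed set of supported languages
DEFAULT_LANG = "en"              # assumed default


def parse_host(host):
    """Split a Host header into (mobile, lang); lang is None if absent."""
    host = host.lower().split(":")[0]  # ignore any port
    if host == BASE:
        labels = []
    elif host.endswith("." + BASE):
        labels = host[: -len(BASE) - 1].split(".")
    else:
        raise ValueError(f"unknown host: {host}")
    mobile = bool(labels) and labels[0] == "mobile"
    if mobile:
        labels = labels[1:]
    if not labels:
        return mobile, None
    if len(labels) == 1 and labels[0] in LANGUAGES:
        return mobile, labels[0]
    raise ValueError(f"unknown host: {host}")


def redirect_target(host, preferred_lang=DEFAULT_LANG):
    """Return the URL to redirect to, or None if the host names a language."""
    mobile, lang = parse_host(host)
    if lang is not None:
        return None  # already on a language subdomain, serve content
    prefix = "mobile." if mobile else ""
    return f"http://{prefix}{preferred_lang}.{BASE}/"
```

For example, redirect_target("mobile.mydomain.com") gives http://mobile.en.mydomain.com/, while a request for fr.mydomain.com needs no redirect. The preferred_lang argument is where a cookie or Accept-Language guess could be passed in.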