Scenario

The web server gets a request for http://domain.com/folder/page. The Accept-Language header tells us the user prefers Greek, with the language code el. That's good, since we have a Greek version of page.
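Reading the preferred language out of the header could look roughly like the sketch below. It is a simplified illustration, not a full implementation of the header grammar: the function name `pick_language`, the `supported` set and the `en` default are assumptions made for the example.

```python
def pick_language(accept_language, supported, default="en"):
    """Return the best supported language for an Accept-Language header value."""
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by position in the header.
        candidates.append((-quality, position, tag))

    for neg_quality, _, tag in sorted(candidates):
        if neg_quality == 0:  # q=0 marks a language as not acceptable
            continue
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # fall back from "el-gr" to "el"
        if primary in supported:
            return primary
    return default
```

With `supported = {"en", "el"}`, a header such as `el,en;q=0.8` resolves to `el`, and a header naming only unsupported languages falls back to the default.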

Now we could do one of the following with the URL:

  1. Return a Greek version keeping the current URL: http://domain.com/folder/page
  2. Redirect to http://domain.com/folder/page/el
  3. Redirect to http://domain.com/el/folder/page
  4. Redirect to http://el.domain.com/folder/page
  5. Redirect to http://domain.com/folder/page?hl=el
  6. ...other alternatives?

Which one is best? What are the pros and cons from a user's perspective? From a developer's perspective?

A: 

I prefer 3 or 4

x2
+2  A: 

Number four is the best option because it specifies the language code early in the URL. If you are going to provide any redirects, always be sure to use a canonical link tag.

+13  A: 

I would not go for option 1 if your pages are publicly available, i.e. if users are not required to log in to view them. The reason is that a search engine will not scan the different language versions of the page. The same reason counts against option 5: a search engine is less likely to identify two pages as separate pages if the language identification goes in the query string.

Let's look at option 4, placing the language in the host name. I would use that option if the different language versions of the site contain completely different content. On a site like Wikipedia, for example, the Greek version contains its own complete set of articles, and the English version contains another set of articles.

So if you don't have completely different content (which it doesn't seem like from your post), you are left with option 2 or 3. I don't know if there are any compelling arguments for one over the other, but no. 3 looks nicer in my eyes. So that is what I would use.

But just a comment for inspiration. I'm currently working on a web application that has 3 major parts, one public, and two parts for two different user types. I have chosen the following url scheme (with en referring to language of course):

http://www.example.com/en/x/y/z for the public part.
http://www.example.com/part1/en/x/y/z for the one private part
http://www.example.com/part2/en/x/y/z for the other private part.

The reason for this is that if I were to split the three parts up into separate applications, it would be a simple reconfiguration in the web server, since the name of the part is at the top of the path. For example, we might later use a commercial CMS for the public part of the site.

Edit: Another argument against option 1 is that if you ONLY listen to Accept-Language, you are not giving the user a choice. The user may not know how to change the language setting in a browser, or may be using a friend's computer that is set up for a different language. You should at least give the user a choice (storing it in a cookie or in the user's profile).

Pete
My question is supposed to be independent of how the user selects the language, by clicking a link or by changing the browser language.
Arne Evertsson
Arne Evertsson
Pete
I guess it could work technically. But then why wouldn't I add another parameter instead of a weird ending slash and a folder name?
Arne Evertsson
Pete
+2  A: 

My own choice is #3: http://domain.com/el/folder/page. It seems to be the most popular out there on the web. All the other alternatives have problems:

  1. http://domain.com/folder/page --- Bad for SEO?
  2. http://domain.com/folder/page/el --- Doesn't work for pages with parameters. This looks weird: ...page?par1=x&par2=y/el
  3. http://domain.com/el/folder/page --- Looks good!
  4. http://el.domain.com/folder/page --- More work needed to deploy since it requires adding subdomains.
  5. http://domain.com/folder/page?hl=el --- Bad for SEO?
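The odd-looking URL in item 2 is what you get when the language code is appended to the whole URL string. As a hedged aside, if the URL is split into its components first, the code can go at the end of the path and the parameters stay where they were. The helper name `append_language_segment` is made up for this sketch:

```python
from urllib.parse import urlsplit, urlunsplit


def append_language_segment(url, lang):
    """Append the language code to the path, leaving the query string intact."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") + "/" + lang
    return urlunsplit(parts._replace(path=path))
```

For `http://domain.com/folder/page?par1=x&par2=y` this gives `http://domain.com/folder/page/el?par1=x&par2=y`. Whether that reads any better than option 3 is still a matter of taste.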
Arne Evertsson
Could you elaborate on your explanation for the second URL, please.
Gumbo
Added an example above.
Arne Evertsson
Brian
+2  A: 

Pick option 5, and I don't believe it is bad for SEO.

This option is good because it shows that the content for say:
http://domain.com/about/corporate/locations is the same as the content in
http://domain.com/about/corporate/locations?hl=el except that the language differs.
The hl parameter should override the Accept-Language header so that the user can easily control the matter. The header would only be used when the hl parameter is missing. Granted, linking is a little complicated by this, and should probably be addressed either through a cookie, which would keep the redirection going to the language chosen by the hl parameter (as the user may have changed it from the Accept-Language setting), or by processing all the links on the page to add the current hl parameter.

The SEO issues can be addressed by creating index files for everything, like Stack Overflow does. These could include multiple sets of indices for the different languages, which would hopefully encourage the pages to show up in results for the non-default languages.

The use of 1 takes away the differentiator in the URL. The use of 2 and 3 suggests that the page is different, possibly beyond just language, like Wikipedia is. And the use of 4 suggests that the server itself is separated, perhaps even geographically.
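A minimal sketch of that precedence, and of the link-processing alternative, might look as follows. The cookie name `lang`, the `SUPPORTED` set, the `en` default and both function names are assumptions for illustration; `header_language` stands for whatever language was already derived from the Accept-Language header.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit

SUPPORTED = {"en", "el"}  # hypothetical set of available translations


def resolve_language(url, cookies, header_language, default="en"):
    """Pick the language: hl parameter first, then cookie, then header."""
    query = parse_qs(urlsplit(url).query)
    candidates = [
        query.get("hl", [None])[0],  # explicit choice in the URL
        cookies.get("lang"),         # choice remembered from an earlier request
        header_language,             # browser preference
    ]
    for candidate in candidates:
        if candidate and candidate.lower() in SUPPORTED:
            return candidate.lower()
    return default


def with_language(url, lang):
    """Return the link with its hl parameter set to lang."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "hl"]
    query.append(("hl", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Here `resolve_language` would run on each request, while `with_language` would be applied to every internal link when the page is rendered, if the cookie approach is not used.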

Because there is a surprisingly poor correlation of geographic location to language preferences, the issue of providing geographically close servers should be left to a proper CDN setup.

dlamblin
+1  A: 

It depends. I would choose number four personally, but many successful companies have different ways of achieving this.

  • Wikipedia uses subdomains for various languages (el.wikipedia.org).
    • So does Yahoo (es.yahoo.com for Spanish), although it doesn't support Greek.
    • So does Gravatar (el.gravatar.com)
  • Google uses a /intl/el/ directory.
  • Apple uses a /gr/ directory (albeit in English and limited to an iPhone page)

It's really up to you. What do you think your customers will like the most?

Alexsander Akers
+4  A: 

I'd choose number 3, redirect to http://example.com/el/folder/page, because:

  1. Language selection is more important than page selection, so the selected language should come first in a truly human-readable URL.
  2. Only one domain gets all of Google's PageRank. That's good for SEO.
  3. You could advertise your site locally with a language code built in. E.g. in Greece you would advertise it as http://example.com/el/, so every local visitor would land on the site in Greek and avoid the frustration of choosing a language.

Alternatively, you can go for number 5: it is fine for Google and friends, but not as nice for a user.

Also, we should refrain from redirecting a user anywhere unless it is required. Thus, in my mind, a user opening http://example.com/folder/page should not get a redirect, but a page in the default language.

sanmai
In your last paragraph it seems you prefer alternative number 1?
Arne Evertsson
I don't think he does, just that the default URL should deliver the page in a default language.
Marcus Downing
+1  A: 

None of them. A 'normal user' wouldn't understand (and so remember) any of those abbreviations.

In order of preference I'd suggest:

  1. http://www.domain.gr/folder/page
  2. http://www.domain.com/
  3. http://domain.com/gr/folder/page
Jon Hadley
I wouldn't expect a user to remember them, since I would redirect from domain.com to the language-specific URL.
Arne Evertsson
In that case, option 2, don't display anything the user doesn't need to see.
Jon Hadley
Just my two cents: from my own experience, displaying nothing can be a real pain on some websites, particularly those that redirect you to the home page when you change language. Then you have to find your page again, which is not always easy if you came from an external link. So, if I can, I like to be able to change the URL. The typical use case for this, particularly on 'government sites', is when I want to send the URL to a friend who does not understand the page's language. OK, it's a particular use case, and it is due to crappy language-switching behavior.
HeDinges
+1  A: 

3 or 4.

3: Can be easily dealt with using htaccess/mod_rewrite. The only downside is that you'd have to write some method of automatically injecting the language code as the first segment of the URI.
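As a rough sketch of the application side (the rewrite rules themselves are not shown), the injection and extraction could be two small helpers like these. `SUPPORTED`, the `en` default and the function names are assumptions for the example:

```python
SUPPORTED = {"en", "el"}  # hypothetical set of language codes


def split_language(path, default="en"):
    """Split "/el/folder/page" into ("el", "/folder/page")."""
    segments = path.lstrip("/").split("/", 1)
    first = segments[0].lower()
    if first in SUPPORTED:
        rest = "/" + segments[1] if len(segments) > 1 else "/"
        return first, rest
    # No recognised language prefix: keep the path and use the default.
    return default, path


def add_language(path, lang):
    """Inject lang as the first path segment, replacing any existing one."""
    _, rest = split_language(path)
    return "/" + lang + rest
```

`add_language` would be called wherever internal links are generated, so that templates do not have to hard-code the prefix.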

4: Probably the best method. Using host headers, it can all be sent to the same web application/content but you can then use code to extract the language code and go from there.
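Extracting the code from the Host header could be as small as the sketch below. The base domain `domain.com` is taken from the question; `SUPPORTED`, the `en` default and the function name are assumptions, and IPv6 literals in the header are not handled:

```python
SUPPORTED = {"en", "el"}  # hypothetical set of language codes


def language_from_host(host, base_domain="domain.com", default="en"):
    """Return the language subdomain of a Host header value, if any."""
    hostname = host.split(":")[0].lower().rstrip(".")  # drop port and trailing dot
    suffix = "." + base_domain
    if hostname.endswith(suffix):
        label = hostname[: -len(suffix)]
        if label in SUPPORTED:
            return label
    return default
```

All subdomains can then point at the same application, with this lookup running once per request.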

Simples. ;)

Gavin