I have a site with many people from around the world. The entire thing is UTF-8 so people are free to submit content and speak in any language they wish - from Greek to English.

Now the only thing that the user can't control is the built-in site language used for things like navigation and instructions for registering. However, these strings are in a language file and can easily be translated.

So now I need to know how I can build a system that will attempt to auto-detect the user's preferred language while still allowing that to be changed.

The easiest way to do this is to check the Accept-Language header the user's browser sends. Most browsers will be installed in the language that the user wants - and even then they can change the language in the settings.
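As a rough illustration of reading that header, here is a minimal Python sketch that turns an Accept-Language value into a list of tags ordered by the browser's stated preference. The function name `parse_accept_language` is made up for this example, and lower-casing the tags and using underscores is only an assumption to match the `fr_ca` style used in this question.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        pieces = part.strip().split(";")
        # Normalise "fr-CA" to "fr_ca" to match the directory naming above.
        tag = pieces[0].strip().lower().replace("-", "_")
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q value counts as q=1
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:  # q=0 means the language is explicitly refused
            weighted.append((-quality, position, tag))
    # Sort by descending quality, keeping header order for ties.
    return [tag for _, _, tag in sorted(weighted)]


print(parse_accept_language("fr-CA,fr;q=0.8,en-US;q=0.5,en;q=0.3"))
# ['fr_ca', 'fr', 'en_us', 'en']
```

The resulting list can then be compared against the locales the site actually supports, taking the first match.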

The problem is that Internet cafe users might want to change these settings but be unable to, and other users may not be comfortable enough with the web to even know the settings exist. Picture a dumb American tourist on some computer in another country. (I can laugh at them because I am one).

So these people need a better way to change the language in case they can't override the browser. To solve that, I was thinking that I could implement a URL-based system like site.com/en_us/... or site.com/fr_ca/....

Ok, so here is how I would imagine it working.

  • New UserA comes to the site.
  • My site finds no locale cookie set
  • looks at and parses accept-language header
  • checks for existing lang dir (to make sure I support fr_ca!)
  • sets cookie with language as fr_ca
  • redirects user to site.com/fr_ca

  • UserA now loads site.com/fr_ca

  • My site finds locale cookie
  • Locale cookie matches URL locale
  • User must want this locale
  • continue loading page

  • UserB gets link from UserA pointing to site.com/fr_ca

  • UserB loads page and my site finds no locale cookie

Here is where I'm kind of lost as to what to do next. Either:

  • My site finds UserB browser says en_us so it redirects them or
  • My site creates a locale cookie with fr_ca and the user will have to visit en_US to change that.

Does this seem like a pretty solid way to handle language detection so I know whether to say register or registre?

+1  A: 

I would go with "My site finds UserB browser says en_us so it redirects them", as this would be expected behavior for a new visitor - if you do not have language switchers anywhere.

On the other hand, if you do provide somewhere a well-visible block of language switchers - then I'd use "My site creates a locale cookie with FR and the user will have to visit EN to change that", because "wrong language" can be easily switched.

Overall, I'd recommend implementing visible language switches - this is the most reliable fallback if everything else fails and your French visitor sees a website in Greek.

Then these roles could be assigned to every level of language negotiation:

  1. (highest priority) when there is a language specifier in the URL, use that, no matter what the values of the cookie and accept-language are
  2. otherwise, the cookie reports the visitor's preference
  3. when there is no cookie, accept-language tells which language to use
  4. (lowest priority) when there is no accept-language (or you do not support any language it lists), use the default language
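
One way to read those roles is: URL first, then cookie, then accept-language, then the default. Here is a minimal Python sketch of that order. The locale set, the default, and the function names are all hypothetical, and `accepted` is assumed to be a list of already-parsed accept-language tags, best first.

```python
SUPPORTED = {"en_us", "fr_ca", "el_gr"}  # hypothetical set of translated locales
DEFAULT = "en_us"                        # hypothetical default language


def locale_from_path(path):
    """Return the leading path segment if it names a supported locale."""
    first = path.strip("/").split("/")[0].lower()
    return first if first in SUPPORTED else None


def resolve_locale(path, cookie_locale, accepted):
    """Pick a locale: URL, then cookie, then accept-language, then default."""
    url_locale = locale_from_path(path)
    if url_locale:
        return url_locale              # the URL always wins
    if cookie_locale in SUPPORTED:
        return cookie_locale           # a stored preference comes next
    for tag in accepted:               # then the browser's ordered wishes
        if tag in SUPPORTED:
            return tag
    return DEFAULT                     # nothing usable was found


print(resolve_locale("/fr_ca/register", "en_us", ["el_gr"]))  # fr_ca
print(resolve_locale("/register", None, ["de_de", "fr_ca"]))  # fr_ca
```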
chronos
A: 

Keep in mind there is a small issue with right-to-left languages such as Arabic: you have to define the writing direction in most tags. Also, I would make the language parameter less strict: instead of using en_US or en_GB, use a plain en parameter, as there is not much difference between them.
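
To make both points concrete, here is a small Python sketch, offered only as an illustration: it reduces a locale such as en_US to its base language and reports the writing direction for it. The helper names are invented, and the right-to-left set is deliberately short and would need extending for a real site.

```python
# Base language codes written right to left: Arabic, Hebrew, Persian, Urdu.
# This list is intentionally incomplete.
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def base_language(locale):
    """Reduce 'en_US' or 'en-GB' to the bare language code 'en'."""
    return locale.replace("-", "_").split("_")[0].lower()


def text_direction(locale):
    """Return 'rtl' or 'ltr', e.g. for a dir attribute or a CSS class."""
    return "rtl" if base_language(locale) in RTL_LANGUAGES else "ltr"


print(base_language("en_GB"))   # en
print(text_direction("ar_EG"))  # rtl
```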

Nazariy
That is a simple CSS fix which could be enabled by the locale - so I don't think it's a problem.
Xeoncross
+1  A: 

Here's the way I do it...

Always have a place on the site where the user can switch languages. The language buttons should load the same page they are currently on, and set a cookie to that language.

To test for which language to load...

// cookie exists, they know what language they want
if cookie exists load that language

// cookie doesn't exist - it's expired or this is their first visit
elseif accept-language is set, load that language; set cookie

// no accept-language present
else load some default language; set cookie

Say an English speaker in France comes to your site.

  • No cookie is present, but accept-language is French.

  • French is loaded, cookie is set.

  • User sees the language switcher and chooses English. English loads and a cookie is set for his remaining time on the site.

If cookies aren't accepted on the browser you pass a lang=[lang] in the query string and check for that after you check for the cookie.
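
The whole check, including the query string fallback, could be sketched in Python roughly like this. It is only an outline under assumptions: the language list, the default, and the `choose_language` name are made up, `cookies` and `query` are plain dicts, and `accepted` is a list of accept-language codes already parsed and ordered best first.

```python
SUPPORTED = ("en", "fr", "el")  # hypothetical list of available translations
DEFAULT = "en"                  # hypothetical default language


def choose_language(cookies, query, accepted):
    """Return (language, set_cookie) following the order described above."""
    # Cookie exists: they know what language they want.
    if cookies.get("lang") in SUPPORTED:
        return cookies["lang"], False
    # No cookie (maybe cookies are disabled): look for ?lang=xx next.
    if query.get("lang") in SUPPORTED:
        return query["lang"], True
    # First visit or expired cookie: fall back to accept-language.
    for code in accepted:
        if code in SUPPORTED:
            return code, True
    # No usable accept-language: load the default.
    return DEFAULT, True


print(choose_language({}, {}, ["fr", "en"]))            # ('fr', True)
print(choose_language({"lang": "en"}, {}, ["fr"]))      # ('en', False)
```

The second value just tells the caller whether to send a cookie with the response. Sending one to a browser that refuses cookies is harmless, since the query string keeps carrying the choice.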

Galen
Good outline. The benefit of your idea is that I don't have to mess with URL stuff, so if the user also wanted to translate the page content, Google Translate would work too since there is only one set of URLs.
Xeoncross
It's not "restful" but I found that adding /lang/ to the URLs complicated things a lot.
Galen
A: 

I would make the language indicator as transparent as possible. That means I’d put it in the URL like you suggested and not in a cookie. In the URL, the user can easily change it if necessary.

Now if the language indicator is missing, you can do the language negotiation on Accept-Language or other criteria and redirect the user to that language specific representation of the requested resource.

Additionally, you could do the language negotiation on every request and compare the requested language in the URL with the one that’s been negotiated. If they differ, you could present a message that a representation in the preferred language is also available.
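
A minimal Python sketch of both ideas, assuming a made-up `handle_request` helper, a hypothetical set of languages, and an `accepted` list of Accept-Language codes that has already been parsed and ordered best first:

```python
SUPPORTED = {"en", "fr", "el"}  # hypothetical set of available languages
DEFAULT = "en"                  # hypothetical default language


def negotiate(accepted):
    """Return the first supported language the browser asked for, or None."""
    for code in accepted:
        if code in SUPPORTED:
            return code
    return None


def handle_request(path, accepted):
    """Return (action, value, notice_language) for one request.

    action is "redirect" (value is the new path) or "render" (value is the
    language taken from the URL). notice_language is set when a page in the
    visitor's preferred language could be offered instead.
    """
    preferred = negotiate(accepted)
    first = path.strip("/").split("/")[0]
    if first not in SUPPORTED:
        # Language indicator missing: redirect to the negotiated language.
        target = preferred or DEFAULT
        return ("redirect", "/" + target + "/" + path.lstrip("/"), None)
    # Indicator present: honour it, but mention the preferred language if it differs.
    notice = preferred if preferred and preferred != first else None
    return ("render", first, notice)


print(handle_request("/about", ["fr"]))     # ('redirect', '/fr/about', None)
print(handle_request("/fr/about", ["en"]))  # ('render', 'fr', 'en')
```

The URL language is never overridden here. The notice only gives the template a hint that a link to the other language could be shown.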

Gumbo
What if I used an `en.domain.com` format instead of `domain.com/en`? That might help to settle the difference in the user's mind a little better, and it would also allow cleaner URLs when using frameworks like MicroMVC and CodeIgniter.
Xeoncross
Wait, to use the `en.domain.com` format the server would have to have vhosts set up, which might not be possible on shared hosts. So scratch that.
Xeoncross