I want to make a multi-language site, such that all or almost all pages will be available in 2 or more translations. What are the best practices to follow?

For example, I consider these language selection mechanisms:

  1. Cookie-based selection of the preferred language.
  2. Based on Accept-Language header if the cookie is not set.
  3. Based on GeoIP otherwise (probably).

Is there anything else?

How should different translations be served?

  1. as LANG.example.com/page
  2. as example.com/LANG/page
  3. as example.com/page?hl=LANG
  4. ...
  5. any of the above with a redirect to example.com/page? (It seems to be discouraged)

How to ensure that all the translations are properly indexed?

  1. Are sitemaps listing all pages, plus a correct Content-Language header, enough?

What is the best way to let users know that other translations exist without distracting them?

  1. list available languages in the header/footer/sidebar (like Wikipedia)
  2. put “Choose a language” selector next to the content

What is the best policy to deal with missing/outdated translations?

  1. do not display missing pages at all or display a page in a different language?
  2. display old translation, old translation with a warning or a page in a different language?

What else should I take into account? What should I definitely do, and what should I definitely avoid?

+3  A: 

Decide early on whether you need to support languages that require double-byte characters (Chinese, Japanese, Korean, etc.). Unicode is the preferable choice. Changing the encoding later can be tedious, especially if you have a database that doesn't use Unicode.

Fredriku73
Yes, I am going to use Unicode. Thank you!
jetxee
+3  A: 
  1. Cookie-based selection of the preferred language.
  2. Based on Accept-Language header if the cookie is not set.

These two you should support.
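
The order described above (cookie first, then the Accept-Language header) could be sketched roughly as follows. This is only an illustration, not a complete implementation: the `SUPPORTED` tuple, the `DEFAULT` value and the function names are assumptions made up for the example, and the header parsing is deliberately simplified.

```python
# Hypothetical site configuration, not taken from the question.
SUPPORTED = ("en", "de", "fr")
DEFAULT = "en"


def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # ignore entries with a malformed weight
        if q > 0:
            # Sort by descending weight, then by order of appearance.
            weighted.append((-q, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(cookie_lang, accept_language):
    """Cookie first, then Accept-Language, then the site default."""
    if cookie_lang and cookie_lang.lower() in SUPPORTED:
        return cookie_lang.lower()
    for tag in parse_accept_language(accept_language or ""):
        primary = tag.split("-")[0]  # "de-at" -> "de"
        if primary in SUPPORTED:
            return primary
    return DEFAULT
```

A call such as `choose_language(None, "de-AT,de;q=0.9,en;q=0.8")` would pick German here, while an explicit cookie value always wins over the header.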

Put a big English banner at the top of your page that reads "This page in English".

as example.com/LANG/page

This is the best choice.

LANG.example.com doesn't work well with browser autocomplete, and query strings such as ?hl=LANG look ugly.
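
With the example.com/LANG/page scheme, the server has to separate the language prefix from the rest of the path. A minimal sketch, assuming a hypothetical set of supported language codes and a made-up helper name:

```python
# Hypothetical set of language prefixes the site serves.
SUPPORTED = {"en", "de", "fr"}


def split_lang_prefix(path):
    """Split '/de/about' into ('de', '/about').

    Paths without a known language prefix are returned unchanged
    with None as the language, so the caller can fall back to
    cookie or header based detection.
    """
    segments = path.lstrip("/").split("/", 1)
    first = segments[0].lower()
    if first in SUPPORTED:
        rest = "/" + (segments[1] if len(segments) > 1 else "")
        return first, rest
    return None, path
```

When the language comes back as `None`, the usual choice is to answer with a redirect to the prefixed URL rather than serving the content at the bare path.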

list available languages in the header/footer/sidebar (like Wikipedia)

A "Choose a language" drop-down is confusing: written in a foreign language, it is unintelligible to the visitor, and written in English, it spoils the overall impression of a localized page.

It is also easy to select by mistake a language you don't even have fonts for, which leaves you on a page full of question marks.

display old translation with a warning

That way the reader knows there is something they can read to get the point, while for the details they had better get a dictionary and read the English version.

Quassnoi
The point about autocomplete is very interesting. It hadn't occurred to me. Thank you for a good answer!
jetxee
+4  A: 

In addition to @Quassnoi's answer, ensure that you use standard RFC 4646 language identifiers (e.g. EN-US, DE-AT); you may already be aware of this. The CLDR project is an excellent repository of internationalization data (the Supplemental Data is really useful).

If a translation of a specific page is not available, use a language fallback mechanism back to the neutral language; for example "DE-AT", "DE", "" (neutral, e.g. "EN").
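
The fallback described above can be sketched as a list of candidates that is tried in order. The function names and the choice of "en" as the neutral language are assumptions for illustration, and the sketch assumes that language tags are written with consistent casing throughout the site.

```python
def fallback_chain(locale, neutral="en"):
    """Build the lookup order, e.g. 'de-AT' -> ['de-AT', 'de', 'en']."""
    parts = locale.split("-") if locale else []
    # Drop one subtag at a time, from most to least specific.
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if neutral not in chain:
        chain.append(neutral)
    return chain


def pick_translation(locale, available, neutral="en"):
    """Return the first tag in the chain that has a translation, or None."""
    for candidate in fallback_chain(locale, neutral):
        if candidate in available:
            return candidate
    return None
```

For a page that exists only in "de" and "en", `pick_translation("de-AT", {"de", "en"})` would serve the generic German version.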

Most recent browsers and the underlying operating systems will correctly show all of the characters required for a locale selector list if the page is encoded correctly (I'd recommend all pages being UTF-8). Ensure that the locale list contains both the native and current-language names to allow both native and non-native speakers to view the specified translations, e.g. "Deutsch (German)" if the current locale is EN-*.
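
The selector labels could be built along these lines. The two name tables are tiny hand-written placeholders invented for the example; in practice the names would more likely come from CLDR data.

```python
# Placeholder data: each language's name written in that language.
NATIVE_NAMES = {"en": "English", "de": "Deutsch", "fr": "français"}

# Placeholder data: language names as seen from the current UI language.
NAMES_IN = {
    "en": {"en": "English", "de": "German", "fr": "French"},
    "de": {"en": "Englisch", "de": "Deutsch", "fr": "Französisch"},
}


def selector_label(target, current):
    """Label for `target` in a selector shown on a page in `current`."""
    native = NATIVE_NAMES[target]
    translated = NAMES_IN.get(current, {}).get(target)
    # Skip the parenthesis when it would repeat the native name
    # or when no translated name is known.
    if translated is None or translated == native:
        return native
    return f"{native} ({translated})"


labels = [selector_label(code, "en") for code in ("en", "de", "fr")]
```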

A lot of sites use a flag icon to show the current locale, but this is more relevant to the location and some people may be offended if you show only a dominant flag (e.g. the US or UK flag for English).

It may be worthwhile to have a more visible (semi-graphical) locale selector on the home page if no locale cookie has been submitted, using a combination of GeoIP and Accept-Language to determine the default locale choice.

Semi-related: if your users are located in different time zones, include a time zone preference in their account profile so that time values can be displayed in their local time, and store all timestamps in UTC.
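
A small sketch of the "store UTC, convert on display" idea. The helper names are made up, and a fixed UTC+1 offset stands in for the user's preference to keep the example self-contained; a real profile would more likely store an IANA zone name such as "Europe/Vienna" and use `zoneinfo.ZoneInfo` so that daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone


def now_utc():
    """Timestamp to store: always timezone-aware and in UTC."""
    return datetime.now(timezone.utc)


def to_user_time(stored_utc, user_tz):
    """Convert a stored UTC timestamp to the user's preferred zone."""
    if stored_utc.tzinfo is None:
        # Assumption: naive values coming out of storage are UTC.
        stored_utc = stored_utc.replace(tzinfo=timezone.utc)
    return stored_utc.astimezone(user_tz)


stored = datetime(2009, 3, 1, 23, 30, tzinfo=timezone.utc)
user_tz = timezone(timedelta(hours=1))  # fixed-offset stand-in for a profile setting
local = to_user_time(stored, user_tz)
```

Note that the conversion can change the calendar date as well as the hour, which is one reason to convert only at display time.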

devstuff
Very interesting issues. I'll follow your advice. Thank you!
jetxee