Internationalizing web apps always seems to be a chore. No matter how much you plan ahead for pluggable languages, there are always issues with encoding, funky phrasing that doesn't fit your templates, and other problems.

I think it would be useful to get the SO community's input for a set of things that programmers should look out for when deciding to internationalize their web apps.

+4  A: 

First, get a good understanding of what internationalization (i18n) and localization (l10n) mean. Oh, and don't forget multilingualization (m17n), of course ;-).

Whether you use Perl or not, I strongly recommend reading Localizing Your Perl Programs; it's an enlightening text.

Practically speaking, in many cases, throwing gettext at your code might be both easy and good enough. It's available for almost any language, and you get many existing tools to handle translation, including translator-friendly GUIs.
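As a rough illustration of the gettext approach, here is a minimal sketch using Python's standard gettext module. The domain name "myapp" and the "locale" directory are assumptions made up for this example, not anything from the answer above. With no compiled catalog present, the fallback simply echoes the source strings, so the snippet runs as-is.

```python
import gettext

# "myapp" and "locale" are hypothetical; compiled catalogs would live at
# locale/<lang>/LC_MESSAGES/myapp.mo
def load_translator(lang):
    # fallback=True returns a NullTranslations object (which echoes the
    # source string) instead of raising when no catalog is found.
    return gettext.translation(
        "myapp", localedir="locale", languages=[lang], fallback=True
    )

translator = load_translator("de")
_ = translator.gettext

# Mark every user-visible string by wrapping it in _().
label = _("Click here")

def new_messages(count):
    # ngettext picks the singular or plural source string (or the right
    # translated form when a catalog with plural rules is loaded).
    template = translator.ngettext(
        "You have %d new message", "You have %d new messages", count
    )
    return template % count

print(label)
print(new_messages(3))
```

Tools such as xgettext can then extract the marked strings into a template for translators.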

Nowhere man
Just read that article, very interesting.
Vincent McNabb
A: 

I have a couple of apps that are "bilingual". I used resource files in ASP.NET 1.1.

There is also something called the String Resource Tool. Basically, you put all your strings in a .RES file for each language and then determine which file to read from based on the Culture or on whether someone clicked a link for the language.
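The selection logic can be sketched in a few lines. This is not the ASP.NET resource API; it is a hedged Python illustration in which the RESOURCES tables stand in for the per-language files, and the names pick_culture and lookup are invented for the example.

```python
# Hypothetical in-memory stand-ins for the per-language resource files.
RESOURCES = {
    "en": {"greeting": "Welcome", "logout": "Log out"},
    "fr": {"greeting": "Bienvenue"},
}
DEFAULT_CULTURE = "en"

def pick_culture(clicked_language, browser_culture):
    # An explicit click on a language link wins over the detected culture.
    return clicked_language or browser_culture or DEFAULT_CULTURE

def lookup(key, culture):
    # Try the full culture (fr-CA), then its language (fr), then the default.
    for name in (culture, culture.split("-")[0], DEFAULT_CULTURE):
        table = RESOURCES.get(name, {})
        if key in table:
            return table[key]
    raise KeyError(key)

culture = pick_culture(None, "fr-CA")
print(lookup("greeting", culture))  # found in the "fr" table
print(lookup("logout", culture))    # missing in "fr", falls back to "en"
```

Falling back to the default language for a missing key keeps the page usable while a translation is still incomplete.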

The biggest gotcha is making sure the translations are done correctly.

WACM161
+2  A: 

In my company all our strings are stored in *.properties files. Our build tools build a "test language" copy of the properties files, which replaces a string like this:

Click here

with something like this:

[~~~~~~~ Çļïčк н∑ѓё  ~~~~ タヌクウ ~~~~]

Now, when we set the language to "test" in our config files, these properties files are used. (And of course we don't ship the test language files.)

This allows us to:

  1. Make sure that Unicode characters are displayed correctly, including Japanese/Chinese/Korean.
  2. Make sure that the layout scales appropriately for languages with longer words (German in particular has longer words on average than English).
  3. Spot any hard-coded strings (as they will still be in plain English).
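A generator for such a test language can be quite small. The following Python sketch is only a guess at the idea, not the build tool described above: the look-alike table, the padding ratio, and the function names are all assumptions. It also does not protect format placeholders such as {0}, which a real tool would need to leave untouched.

```python
# Hypothetical mapping from ASCII letters to accented or Cyrillic look-alikes.
LOOKALIKES = str.maketrans({
    "C": "Ç", "l": "ļ", "i": "ï", "c": "č", "k": "к",
    "h": "н", "e": "ё", "r": "ѓ", "a": "á", "o": "ö",
})

def pseudo_localize(text, pad=0.5):
    # Swap letters for look-alikes, add some Japanese characters, and pad
    # with tildes so the result is visibly longer than the source string.
    body = text.translate(LOOKALIKES)
    fill = "~" * max(1, int(len(text) * pad))
    return f"[{fill} {body} タヌクウ {fill}]"

def pseudo_localize_properties(lines):
    # Rewrite only the value side of "key=value" lines; keep comments,
    # blank lines, and anything without "=" unchanged.
    out = []
    for line in lines:
        if line.lstrip().startswith("#") or "=" not in line:
            out.append(line)
            continue
        key, value = line.split("=", 1)
        out.append(f"{key}={pseudo_localize(value)}")
    return out

source = ["# buttons", "button.ok=Click here"]
print("\n".join(pseudo_localize_properties(source)))
```

The brackets make truncation easy to spot: if a closing "]" is missing on screen, the layout clipped the string.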

As for the actual translation, this is done by professional translators, not developers.

Kip
+8  A: 

Internationalization is hard. Here are a few things I've learned from working with 2 websites that were in over 20 different languages:

  • Use UTF-8 everywhere. No exceptions. HTML, server-side language (watch out for PHP especially), database, etc.
  • No text in images unless you want a ton of work. Use CSS to put text over images if necessary.
  • Separate configuration from localization. That way localizers can translate the text and you can deal with different configurations per locale (features, layout, etc). You don't want localizers to have the ability to mess with your app.
  • Make sure your layouts can deal with text that is 2-3 times longer than English, and also with text that is 50% shorter than English (Japanese and Chinese are often shorter).
  • Some languages need larger font sizes (Japanese, Chinese).
  • Colors are locale-specific also. Red and green don't mean the same thing everywhere!
  • Add a classname that is the locale name to the body tag of your documents. That way you can specify a specific locale's layout in your CSS file easily.
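  • Watch out for variable substitution. Don't split your strings. Leave them whole like this: "You have X new messages" and replace the 'X' with the number. Word order differs between languages, so translators need the whole sentence to move the placeholder around.
  • Different languages have different pluralization rules. Some need separate forms for counts such as 0, 1, 2-4, and 5 or more. Hard to deal with.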
  • Context is difficult. Sometimes localizers need to know where/how a string is used to make sure it's translated correctly.
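The points about whole strings and pluralization can be sketched together. This Python example is illustrative only: the MESSAGES catalog, the function names, and the Polish sentences are assumptions added for the example, and the Polish rule is hand-written here, where a real app would take plural rules from a library or from CLDR data.

```python
# Hypothetical catalog: each plural category maps to a whole sentence with a
# named placeholder, so translators can reorder words freely.
MESSAGES = {
    "en": {
        "one": "You have {count} new message",
        "other": "You have {count} new messages",
    },
    "pl": {
        "one": "Masz {count} nową wiadomość",
        "few": "Masz {count} nowe wiadomości",
        "many": "Masz {count} nowych wiadomości",
    },
}

def plural_category(locale, n):
    # Polish: 1 is "one"; numbers ending in 2-4 (except 12-14) are "few";
    # everything else is "many".
    if locale == "pl":
        if n == 1:
            return "one"
        if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
            return "few"
        return "many"
    # English-style rule: singular for exactly 1, plural otherwise.
    return "one" if n == 1 else "other"

def new_messages(locale, count):
    template = MESSAGES[locale][plural_category(locale, count)]
    return template.format(count=count)

for n in (1, 3, 12, 22):
    print(new_messages("en", n), "|", new_messages("pl", n))
```

Keeping one complete sentence per plural category avoids gluing fragments like "You have " + n + " new message" + "s", which cannot be translated correctly.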

Ryan Doherty