A friend of mine is now building a web application with J2EE and Struts, and it's going to be prepared to display pages in several languages.

I was told that the best way to support a multi-language site is to use a properties file where you store all the strings of your pages, something like:

welcome.english = "Welcome!"
welcome.spanish = "¡Bienvenido!"
...

This solution is OK for static strings, but what happens if your site displays news or something similar (a blog)? I mean content that is not static and is updated often. The people who maintain the site have to write every new entry in each supported language and store each version of the entry in the database. The application loads only the entries in the user's chosen language.

How do you design the database to support this kind of implementation?

Thanks.

+4  A: 

The way I have designed the database before is to have a News table containing basic info like NewsID (int), NewsPubDate (datetime) and NewsAuthor (varchar/int), and then a linked table NewsText with these columns: NewsID (int), NewsText (text), NewsLanguageID (int). Finally, you have a Language table that has LanguageID (int) and LanguageName (varchar).

Then, when you want to show your users the news-page you do:

SELECT NewsText FROM News INNER JOIN NewsText ON News.NewsID = NewsText.NewsID
WHERE NewsText.NewsLanguageID = <<Session["UserLanguageID"]>>

That Session bit stands for a per-user variable where you store the user's language when they log in or enter the site for the first time.
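
If the application is written in Java, the query above could be run through JDBC roughly like this. It is only a sketch: the class and method names (NewsQueries, findNewsTexts) are made up, the ORDER BY is my own addition, and the table and column names are taken from the schema described above. The language ID is bound as a parameter rather than pasted into the SQL text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; table and column names follow the schema described above.
class NewsQueries {
    // The "?" is filled in with the user's language ID at execution time.
    // The ORDER BY (newest first) is an assumption, not part of the original query.
    public static final String NEWS_BY_LANGUAGE =
        "SELECT NewsText.NewsText FROM News "
      + "INNER JOIN NewsText ON News.NewsID = NewsText.NewsID "
      + "WHERE NewsText.NewsLanguageID = ? "
      + "ORDER BY News.NewsPubDate DESC";

    // Returns the text of every news entry available in the given language.
    public static List<String> findNewsTexts(Connection conn, int userLanguageId)
            throws SQLException {
        List<String> texts = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(NEWS_BY_LANGUAGE)) {
            stmt.setInt(1, userLanguageId);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    texts.add(rs.getString(1));
                }
            }
        }
        return texts;
    }
}
```

The userLanguageId argument would come from wherever the user's language is kept, for example the HTTP session.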

Espo
A: 

@Espo: Your solution seems to be a good approach. Thank you.

Auron
+7  A: 

Warning: I'm not a Java hacker, so YMMV, but...

The problem with using a list of "properties" is that you need a lot of discipline. Every time you add a string that should be output to the user you will need to open your properties file, look to see if that string (or something roughly equivalent to it) is already in the file, and then go and add the new property if it isn't. On top of this, you'd have to hope the properties file was fairly human readable / editable if you wanted to give it to an external translation team to deal with.

The database-based approach is useful for all your database-based content. Ideally you want to make it easy to tie pieces of content together with their translations. It only really falls down in the places where you want to output something that doesn't come out of a database (error messages etc.).

One fairly old technology that we find still works really well is gettext. Gettext or some variant seems to be available for most languages and platforms. The basic premise is that you wrap your output in a special function call like so:

echo _("Please do not press this button again");

Then running the gettext tools over your source code will extract all the instances wrapped like that into a "po" file. This will contain entries such as:

#: myfolder/my.source:239
msgid "Please do not press this button again"
msgstr ""

And you can add your translation to the appropriate place:

#: myfolder/my.source:239
msgid "Please do not press this button again"
msgstr "Veuillez ne plus appuyer sur ce bouton"

Subsequent runs of the gettext tools simply update your po files. You don't even need to extract the po file from your source. If you know you may want to translate your site down the line, then you can just use the format shown above (the underscored function) with all your output. If you don't provide a po file it will just return whatever you put in the quotes. gettext is designed to work with locales, so the user's locale is used to retrieve the appropriate po file. This makes it really easy to add new translations.
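
To make the fallback behaviour concrete, here is a small Java sketch of the lookup rule described above. It is not a real gettext binding: the Catalog class and its tr method are hypothetical names that only mimic the "return the original string when there is no translation" rule. (Java no longer allows a method named `_`, hence tr.)

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a gettext catalog; not a real gettext library.
class Catalog {
    private final Map<String, String> translations = new HashMap<>();

    // Registers a translation, like one msgid/msgstr pair in a po file.
    public void add(String msgid, String msgstr) {
        translations.put(msgid, msgstr);
    }

    // Returns the translation, or the original string when none exists.
    // An empty msgstr counts as "not translated yet", as in a fresh po file.
    public String tr(String msgid) {
        String msgstr = translations.get(msgid);
        return (msgstr == null || msgstr.isEmpty()) ? msgid : msgstr;
    }
}
```

With an empty catalog, tr("Please do not press this button again") simply hands back the English text, so wrapping your strings early costs nothing until translations exist.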

Gettext Pros

  • Doesn't get in your way while coding
  • Very easy to add translations
  • PO files can be compiled down for speed
  • There are libraries available for most languages / platforms
  • There are good cross platform tools for dealing with translations. It is actually possible to get your translation team set up with a tool such as poEdit to make it very easy for them to manage translation projects

Gettext Cons

  • Solves your site "furniture" needs, but you would usually still want a database based approach for your database driven content

For more info on gettext, see the Wikipedia page on gettext.

reefnet_alex
A: 

@reefnet_alex: I didn't know about gettext, and it seems pretty interesting, so thank you for your contribution. The thing is, I don't know whether it can be applied to web applications.

Auron
gettext can be applied to web apps. We use it for many of our large websites at my company.
Ryan Doherty
+1  A: 

@Auron

That's what we apply it to. Our apps are all PHP, but gettext has a long heritage.

Looks like there is a good Java implementation

reefnet_alex
+1  A: 

Java web applications support internationalization through the JSP Standard Tag Library (JSTL).

You've really got two problems: static content and dynamic content.

For static content you can use JSTL. It uses Java ResourceBundles to accomplish this. I managed to get a database-backed bundle working with the help of this site.
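
As a rough idea of what a non-file bundle can look like, here is a sketch of a ResourceBundle subclass that reads from a Map. The MapBackedBundle name is hypothetical, and the Map is only a placeholder for rows loaded from a database; this is not the code from the linked site.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Map;
import java.util.ResourceBundle;
import java.util.Set;

// Hypothetical bundle whose strings come from a Map. In a real application
// the Map could be filled from a table of (locale, key, text) rows.
class MapBackedBundle extends ResourceBundle {
    private final Map<String, String> messages;

    public MapBackedBundle(Map<String, String> messages, ResourceBundle fallback) {
        this.messages = messages;
        if (fallback != null) {
            setParent(fallback); // keys missing here are looked up in the fallback
        }
    }

    @Override
    protected Object handleGetObject(String key) {
        // Returning null tells ResourceBundle to try the parent bundle.
        return messages.get(key);
    }

    @Override
    public Enumeration<String> getKeys() {
        Set<String> keys = new HashSet<>(messages.keySet());
        if (parent != null) {
            keys.addAll(Collections.list(parent.getKeys()));
        }
        return Collections.enumeration(keys);
    }
}
```

A Spanish bundle created with an English bundle as its fallback returns the Spanish text from getString("welcome") when it has one, and the English text for any key it lacks.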

The second problem is dynamic content. To solve this problem you'll need to store the data so that you can retrieve different translations based on the user's Locale. (Locale includes Country and Language).
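
One way to picture the lookup for dynamic content is sketched below. It is an in-memory stand-in, not a database layer: the TranslatedContent class, its methods, and the fallback order (language plus country, then language only, then a site default) are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for a table of translated entries,
// keyed by entry ID and then by language tag ("en", "es", "es-MX", ...).
class TranslatedContent {
    private final Map<Integer, Map<String, String>> textsByEntry = new HashMap<>();
    private final String defaultLanguage;

    public TranslatedContent(String defaultLanguage) {
        this.defaultLanguage = defaultLanguage;
    }

    public void put(int entryId, String languageTag, String text) {
        textsByEntry.computeIfAbsent(entryId, id -> new HashMap<>()).put(languageTag, text);
    }

    // Tries language plus country first, then language only, then the site default.
    public Optional<String> find(int entryId, Locale locale) {
        Map<String, String> texts =
            textsByEntry.getOrDefault(entryId, Collections.emptyMap());
        String[] candidates = {
            locale.toLanguageTag(), locale.getLanguage(), defaultLanguage
        };
        for (String tag : candidates) {
            String text = texts.get(tag);
            if (text != null) {
                return Optional.of(text);
            }
        }
        return Optional.empty();
    }
}
```

With entries stored under "en" and "es", a user whose Locale is es-MX gets the Spanish text, and a French user falls back to the English default. Whether to fall back at all, or to hide untranslated entries, is a design decision for the site.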

It's not trivial, but it is something you can do with a little planning up front.

ScArcher2
+1  A: 

Tag libraries are fine if you're using JSP, but you can also achieve I18N using a template-based technology such as FreeMarker.

Andrew Swan
A: 

I am looking for multi-language support for ASP.NET websites. Please help if someone has a brilliant idea...

I recommend that you ask this as a new question (http://stackoverflow.com/questions/ask).
Auron