How should I store (and present) the text on a website intended for worldwide use, with several languages? The content is mostly in the form of 500+ word articles, although I will need to translate tiny snippets of text on each page too (such as "print this article" or "back to menu").

I know there are several CMS packages that handle multiple languages, but I have to integrate with our existing ASP systems too, so I am ignoring such solutions.

One concern I have is that Google should be able to find the pages, even for foreign users. I am less concerned about issues with processing dates and currencies.

I worry that, left to my own devices, I will invent a way of doing this which works but eventually leads to disaster! I want to know what professional solutions you have actually used on real projects, not untried ideas! Thanks very much.

+1  A: 

If you are using .Net, I would recommend going with one or more resource files (.resx). There is plenty of documentation on this on MSDN.

SaaS Developer
+2  A: 

You might want to check out the GNU Gettext project; it is at least something to start with.

Edited to add info about projects:

I've worked on several multilingual projects using Gettext with different technologies, including C++/MFC and J2EE/JSP, and it all worked fine. However, you will of course need to write or find your own code to display the localized data.
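To give an idea of what the "display" side looks like, here is a minimal sketch using Python's standard gettext module (not one of the stacks mentioned above, just the shortest way to show the pattern). The domain name "messages" and the "locale" directory are assumptions for illustration; with fallback=True the original string is returned when no catalog is found.

```python
import gettext


def load_translator(language):
    """Return a translator for `language`.

    "messages" (catalog domain) and "locale" (catalog directory) are
    placeholder names; real catalogs would live in
    locale/<language>/LC_MESSAGES/messages.mo.
    """
    # fallback=True gives a pass-through translator when no catalog exists,
    # so untranslated snippets are shown in the source language.
    return gettext.translation(
        "messages", localedir="locale", languages=[language], fallback=True
    )


# Conventional alias: wrap every user-visible snippet in _().
_ = load_translator("fr").gettext
print(_("print this article"))
print(_("back to menu"))
```

The point is that page code only ever sees the source string wrapped in _(), and the catalogs can be handed to translators separately.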

Michael Pliskin
+1  A: 

If you're just worried about the article content being translated, and do not need a fully integrated option, I have used Google translation in the past and it works great on a smaller scale.

Dr. Bob
Did you ever read a Google-translated text in your native tongue? Not really a professional option.
OregonGhost
I have used Google's translations as a temporary solution, and have had positive comments back from 'foreigners'! Admittedly, I was surprised by these!
Magnus Smith
A: 

I looked at RESX files, but felt they were unsuitable for all but the most trivial translation solutions (I will elaborate if anyone wants to know).

Google will help me with translating the text, but not storing/presenting it.

Has anyone worked on a multi-language project that relied on their own code for presentation?

Magnus Smith
+1  A: 

As with most general programming questions, it depends on your needs.

For static text, I would use RESX files. For me, as a .NET programmer, they are easy to use and the .NET Framework has good support for them.

For any dynamic text, I tend to store such information in the database, especially if the site maintainer is going to be a non-developer. In the past I've used two approaches: adding a language column and creating a separate entry for each language, or creating a separate table to store the language-specific text.

The table for the first approach might look something like this:

Article Id | Language Id | Language Specific Article Text | Created By | Created Date

This works for situations where you can create different entries for a given article and you don't need to keep any data associated with these different entries in sync (such as an Updated timestamp).

The other approach is to have two separate tables, one for non-language specific text (id, created date, created user, updated date, etc) and another table containing the language specific text. So the tables might look something like this:

First Table: Article Id | Created By | Created Date | Updated By | Updated Date

Second Table: Article Id | Language Id | Language Specific Article Text

For me, the question comes down to updating the non-language dependent data. If you are updating that data then I would lean towards the second approach, otherwise I would go with the first approach as I view that as simpler (can't forget the KISS principle).
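As a rough sketch of the second (two-table) approach, here it is with Python's built-in sqlite3. The table and column names, the sample rows, and the fall-back-to-a-default-language lookup are all assumptions for illustration, not something the layout above requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (              -- language-independent data
    article_id   INTEGER PRIMARY KEY,
    created_by   TEXT NOT NULL,
    created_date TEXT NOT NULL,
    updated_by   TEXT,
    updated_date TEXT
);
CREATE TABLE article_text (         -- one row per article per language
    article_id  INTEGER NOT NULL REFERENCES article(article_id),
    language_id TEXT NOT NULL,
    body        TEXT NOT NULL,
    PRIMARY KEY (article_id, language_id)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO article VALUES (1, 'editor', '2009-01-01', NULL, NULL)")
conn.executemany(
    "INSERT INTO article_text VALUES (?, ?, ?)",
    [(1, "en", "Hello"), (1, "fr", "Bonjour")],
)


def article_body(article_id, language_id, default_language="en"):
    """Return the text in the requested language, else in the default one."""
    row = conn.execute(
        "SELECT body FROM article_text "
        "WHERE article_id = ? AND language_id IN (?, ?) "
        # Rows matching the requested language sort first.
        "ORDER BY language_id = ? DESC LIMIT 1",
        (article_id, language_id, default_language, language_id),
    ).fetchone()
    return row[0] if row else None


print(article_body(1, "fr"))
print(article_body(1, "de"))
```

Updating the author or timestamp touches one row in article, no matter how many translations exist.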

Rich McCollister
A: 

Any thoughts on serving up content in the following ways, and which is best?

(these are not real URLs, I was just showing examples)

Magnus Smith
+5  A: 

Firstly, put all code for all languages under one domain; it will help your Google rank.

We have a fully multi-lingual system, with localisations stored in a database but cached with the web application.

Wherever we want a localisation to appear we use:

<%$ Resources: LanguageProvider, Path/To/Localisation %>

Then in our web.config:

<globalization resourceProviderFactoryType="FactoryClassName, AssemblyName"/>

FactoryClassName then implements ResourceProviderFactory to provide the actual dynamic functionality. Localisations are stored in the DB with a string key such as "Path/To/Localisation".

It is important to cache the localised values - you don't want to have lots of DB lookups on each page, and we cache thousands of localised strings with no performance issues.

Use the user's current browser localisation to choose what language to serve up.

Keith
see http://msdn.microsoft.com/en-us/library/ms227427.aspx for further info
Magnus Smith
This seems to me a well-thought solution. Got my vote.
OregonGhost
A: 

Wonderful question.

I solved this problem for the website I made (link in my profile) with a homemade Python 3 script that translates the general template on the fly and inserts the specific content page for the language requested (or guessed by Apache from Accept-Language).

It was fun since I got to learn Python and write my own mini-library for creating content pages. One downside was that our hosting didn't have Python 3, so I made my script generate static HTML (the original version examined the User-Agent header) and then upload it to the server. That has worked so far, and making a new language version of the site is now a breeze :)

The biggest downside of this method is that it is time-consuming to write things from scratch. So if you want, drop me a line and I'll help you use my script :)
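This is not my actual script, just a minimal sketch of the "generate static HTML per language" idea using only the standard library. The template, the snippet and page dictionaries, and the example.<lang>.html naming are all made up for illustration.

```python
from pathlib import Path
from string import Template
import tempfile

# One shared template; $back is a translated snippet, the rest is page content.
PAGE_TEMPLATE = Template(
    "<html><body><h1>$title</h1>$content"
    "<p><a href='/'>$back</a></p></body></html>"
)

# Hypothetical translations of the small interface snippets.
SNIPPETS = {
    "en": {"back": "Back to menu"},
    "fr": {"back": "Retour au menu"},
}

# Hypothetical per-language content for a single page.
PAGES = {
    "en": {"title": "Welcome", "content": "<p>Hello</p>"},
    "fr": {"title": "Bienvenue", "content": "<p>Bonjour</p>"},
}


def build_pages(out_dir, name="example"):
    """Write one static file per language and return the paths written."""
    out_dir = Path(out_dir)
    written = []
    for lang, snippets in SNIPPETS.items():
        html = PAGE_TEMPLATE.substitute(**snippets, **PAGES[lang])
        path = out_dir / f"{name}.{lang}.html"
        path.write_text(html, encoding="utf-8")
        written.append(path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    for page in build_pages(tmp):
        print(page.name)
```

Adding a language then means adding one entry to each dictionary and regenerating.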

ilya n.
A: 

As for the URL format, I use site.com/content/example.fr, since this allows Apache to perform language negotiation when somebody asks for /content/example and their browser says it prefers French. When you do this, Apache also adds .html or whatever as a bonus.

So when a request is for example and I have files

example.fr
example.en
example.vi

Apache will automatically serve example.vi to a person with a Vietnamese-configured browser, or example.en to a person with a German-configured browser. Pretty useful.
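If you are not on Apache and have to do the negotiation yourself, the idea can be approximated as below. This is a simplified sketch, not Apache's actual algorithm: it only looks at the Accept-Language header, and the fallback to "en" is an assumption that mirrors the German example above. The function names are hypothetical.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:
            weighted.append((quality, tag))
    # sorted() is stable, so equal weights keep the order the browser sent.
    return [tag for quality, tag in sorted(weighted, key=lambda w: -w[0])]


def choose_variant(header, available, default="en"):
    """Pick the best available language suffix for a request."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # "vi-vn" also matches a "vi" variant
        if primary in available:
            return primary
    return default


variants = {"fr", "en", "vi"}
print("example." + choose_variant("vi-VN,vi;q=0.9,en;q=0.5", variants))
print("example." + choose_variant("de-DE,de;q=0.9", variants))
```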

ilya n.
