Before you start to read, here is a related question, however I am afraid it lacks details...

My situation is very simple: I am working on a website which is translated into a number of languages. We have variants of the website customized to clients' needs, and they opt for the languages that best target their intended audience.

However, we have a problem with the translation process. The problem is not about versioning (we are programmers after all) or the database layout; I am really speaking about managing the translation itself: the cost and the mishaps.

I think we have identified 3 steps in the process so far.

1. Preparing the documents to be sent
Typically, you want this to be automatic. It is a pain to waste human resources on this recurring task.

2. Translating
It is not so easy to actually translate, and there are various (open) problems here. One example is the lack of context: natural languages can be ambiguous, and the use of synonyms does not help (especially since we are required to give English as the base language and, being a French team, we may not use idiomatic English).

3. Integration / Feedback
A typical problem is a single word (used for a button) translated into a whole sentence: oops, my layout gave way... We ask the translators not to increase the length by more than 25%, but sometimes even this breaks the layout, and sometimes it really restricts them when there is space available.
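
The 25% rule can be checked automatically before integration. Here is a minimal sketch, assuming translations arrive as (id, English, translation) tuples; the function names and the tuple layout are hypothetical, and character count is only a rough proxy for rendered width.

```python
def over_budget(english, translated, max_growth=0.25):
    """True when the translation is more than max_growth longer than the English text."""
    return len(translated) > len(english) * (1 + max_growth)


def find_overlong(rows, max_growth=0.25):
    """rows: iterable of (string_id, english, translated) tuples (assumed layout).

    Returns the rows whose translation exceeds the length budget.
    """
    return [
        (string_id, english, translated)
        for string_id, english, translated in rows
        if over_budget(english, translated, max_growth)
    ]
```

A report like this could be sent back to the translators with the offending ids, instead of discovering the broken layout after integration.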

I would like to know how YOU handle this important step of internationalization. We have had several brainstorming sessions on this, but nothing conclusive came out of them.

I'll give you our current process in an answer below, so that the question and the answer may be voted / commented independently.

Regards.

A: 

Here is our current process.

1. Documents

We send 2 documents:
+ An Excel spreadsheet, comprising 1 column for an 'id', 1 column for the English translation and then 1 column for each language. A 'NOT TRANSLATED' string is used when necessary and we reuse the strings we already have. A script is responsible for indexing the strings and creating this document (generates a .csv which we import).
+ A PowerPoint document comprising various screenshots of the application made with a 'fake' translation file which includes the 'id' of each string, it is used to give context to the translators and so they have a rough estimate of the space available.
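
To make the two documents concrete, here is a rough sketch of what the generating script does, not the actual script. The dictionary layout, the function names and the `[id] text` format of the fake translation are assumptions made for illustration.

```python
import csv
import io

NOT_TRANSLATED = "NOT TRANSLATED"


def build_sheet(english, translations, languages):
    """Build the CSV imported into Excel.

    english:      {string_id: English text}
    translations: {language: {string_id: translated text}} (existing strings are reused)
    languages:    column order for the target languages
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "en"] + list(languages))
    for string_id in sorted(english):
        row = [string_id, english[string_id]]
        for lang in languages:
            # Fall back to the placeholder when no translation exists yet.
            row.append(translations.get(lang, {}).get(string_id, NOT_TRANSLATED))
        writer.writerow(row)
    return out.getvalue()


def fake_translation(english):
    """'Fake' translation file for screenshots: every string displays its own id."""
    return {string_id: f"[{string_id}] {text}" for string_id, text in english.items()}
```

For example, `build_sheet({"btn.save": "Save"}, {"fr": {"btn.save": "Enregistrer"}}, ["fr", "fi"])` yields a header line followed by one row in which the Finnish column holds the placeholder.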

The problem here lies with the PowerPoint document: each time there are changes, human intervention is needed to update the screenshots. Furthermore, several screens may actually vary because of conditional or mutually exclusive elements (so we have to generate some variations of the screens).

2. Translation

Apart from the fact that it takes time, we do not have much visibility since the client is responsible for the wording (we only provide a default and they have to adapt it to their own vocabulary / branding) and thus the translation.

Apparently the PowerPoint document is used, but perhaps not as much as we could hope for. The translation teams seem to work as quickly as possible, and referring to a document (with the time it takes to look things up) not only breaks the flow but also adds significant overhead.

3. Integration / Feedback

More often than not, the client is not so happy with the translations. Some break the layout, others are poorly worded and clearly expose a lack of understanding of the context/situation/field (not that there is much we can do there, I for one do not speak Finnish...)

We have thought about our process, and two points have been raised:

  • The human intervention required to update the screenshots is costly
  • It would be beneficial if the translators had more immediate feedback

Beyond that, we have had several ideas:

  • The PowerPoint is killing us, but the translators need context (well, we could hope that they use it more...). We have thought about embedding a 'context' column in the Excel spreadsheet to provide it, but it seems a poor substitute.
  • There have been proposals to replace the PowerPoint document with a 'living' demo of the website. The idea is to send a static replica of the website, with some variations for the screens that need them. However, how this replica is built and powered is another question: we don't want human intervention to update the screens...
  • If we could completely automate the 3rd step, we could perhaps provide a sandbox to the translators / client so that they could work on the translations in a tighter loop.

As you can see, it really is an open question, and yes, we really have put a lot of thought into this. To compound the problem, there is of course a budget (money / time) issue: this task does not appear as a priority to our management since, after all, the process does not work so badly at the moment...

I am looking forward to your comments.

Matthieu M.
A: 

I have some experience with this, as one of the products my company ships supports more than 20 languages. There are 2 aspects:

  • handling translation on the shipped software
  • in-house phrase database management

For handling translation in the shipped software, we use a phrase id in the code wherever a phrase is used. At run-time, the software looks up the actual phrase string according to the language id and the phrase id, and loads it. There is always a default language (e.g. English) available when the look-up fails (e.g. the Chinese phrase db file is missing while trying to load Chinese phrases). If the translated phrase requires more room than the reserved space, we have a custom control that implements this effect: it turns the font color to blue and wraps the text, ending it with an ellipsis; when the customer clicks the text, a tooltip window pops up showing the full text.
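
As an illustration only (our product is not written this way), the look-up with a default-language fallback and the ellipsis truncation could be sketched as follows. The class and function names are hypothetical, and truncating by character count is an assumption that stands in for measuring the rendered text.

```python
DEFAULT_LANG = "en"


class PhraseCatalog:
    """Phrase tables keyed by language id, then by phrase id."""

    def __init__(self, tables, default_lang=DEFAULT_LANG):
        self.tables = tables
        self.default_lang = default_lang

    def lookup(self, lang_id, phrase_id):
        # A missing language file and a missing phrase are handled the same way:
        # fall back to the default language.
        table = self.tables.get(lang_id, {})
        if phrase_id in table:
            return table[phrase_id]
        return self.tables[self.default_lang][phrase_id]


def elide(text, max_chars):
    """Shorten text to max_chars characters, ending with an ellipsis if it was cut."""
    if len(text) <= max_chars:
        return text
    return text[: max_chars - 1] + "…"
```

The control would display the elided string and keep the full string for the tooltip.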

For in-house phrase database management, we developed a web portal where a developer can look up whether a translation is available. If it is, he grabs the phrase id and goes on with his project. If not, he applies for a new phrase id and goes back to his project with that id, without being blocked by the actual translation. During the application, the developer can add context such as a picture, a doc file, simple text and so on. The site sends the translation request to the translator and emails the admin to notify him of such requests. Everything is automatic.

The site also generates translation db files, either regularly or on request, to be included in the shipped software.

t.g.
Very nice ideas indeed. Do you have some kind of notification for the 'too long' sentences (or perhaps an upfront check before putting them in the database)? I like the idea of a central repository of translations; at the moment we still have some difficulty knowing what has been added since the last translation (we do have the diffs, but proper administration would be better).
Matthieu M.
There's no need for a 'too long' notification in my case. A phrase could be used anywhere, so the length of the string only makes sense once the context, such as the font, is determined. At runtime we use APIs like GetTextExtentPoint32() to compare the text length with the size of the containing control. Regarding differences, I think tracking is better than diff-ing. By tracking I mean you attach one unique ID to each phrase and keep the whole change history against this ID. It's like a DB record with many nullable fields, one for each language, which is translated to that language only when needed.
t.g.
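
A minimal sketch of the tracking idea from the last comment, assuming an in-memory record rather than a real database; the class, its fields and the integer revision numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class PhraseRecord:
    """One record per phrase id; a language with no entry is 'not translated yet'."""

    phrase_id: int
    texts: Dict[str, Optional[str]] = field(default_factory=dict)
    # Each history entry: (revision, language, old text, new text).
    history: List[Tuple[int, str, Optional[str], Optional[str]]] = field(default_factory=list)

    def set_text(self, revision, lang, new_text):
        """Store a text for one language and record the change against this id."""
        old_text = self.texts.get(lang)
        if old_text != new_text:
            self.history.append((revision, lang, old_text, new_text))
            self.texts[lang] = new_text

    def changed_since(self, revision):
        """Languages whose text changed after the given revision."""
        return sorted({lang for rev, lang, _, _ in self.history if rev > revision})
```

With this layout, "what has been added since the last translation" becomes a query on the history rather than a diff between two exported files.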