I'm developing an application that contains a lot of text and several modules that may or may not be included in a given build.

For each saved project we automatically generate a report with all the details (e.g. a description of the algorithms used in that project and so on). Currently we embed all text as strings in the source code, and we also support different languages through po and mo files.

The good point of this system is that it is very easy to generate documentation and report files dynamically. The bad points are that a lot of text in source code is ugly, formatting it (e.g. with HTML) is uncomfortable, the text is difficult to edit, there is no easy spell check, and it is terrible to translate.

So, the final question is: would you rather embed documentation in code, or write external documentation files (for example HTML) for each language and parse them at runtime? Obviously the core text of the software, such as message boxes, will stay in code anyway.

If it matters, I'm working in C++ with wxWidgets.

+5  A: 

I think all text which may change between different versions of the code should be kept in separate property files. You can build a mechanism that maps message ids to the proper string from a property file, for example mapping id 15 to "search" in the English property file and to "busca" in the Spanish one. A property file may be XML or CSV with id-message pairs. When running your program, you supply the property file(s) as parameters. At startup it loads the property strings into a map, and from then on you use property[15] instead of the literal string "search". Of course, you can use a textual label instead of a numerical id. I would also consider generating the documentation from the property files automatically, maybe using CSS. This makes it a lot easier to edit and translate the messages.
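To make the id-to-string map concrete, here is a minimal sketch in plain C++. The function names `loadProperties` and `text`, the one-pair-per-line "id,message" layout, and the "<missing N>" fallback are all my own assumptions for illustration, not something a particular library provides.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads "id,message" lines into a map.
// Assumes one pair per line and a plain integer id before the first comma.
std::map<int, std::string> loadProperties(std::istream& in) {
    std::map<int, std::string> table;
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t comma = line.find(',');
        if (comma == std::string::npos) continue;  // blank or malformed line
        try {
            const int id = std::stoi(line.substr(0, comma));
            table[id] = line.substr(comma + 1);    // everything after the first comma
        } catch (const std::exception&) {
            // id was not a number: skip the line
        }
    }
    return table;
}

// Looks up a message; a missing id yields a visible marker instead of a crash.
std::string text(const std::map<int, std::string>& table, int id) {
    const auto it = table.find(id);
    return it != table.end() ? it->second
                             : "<missing " + std::to_string(id) + ">";
}
```

In a real program you would pass a `std::ifstream` opened on the property file chosen for the current language, then call `text(table, 15)` wherever the code used to contain the literal "search".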

Yuval F
+1  A: 

If I have lots of text to display, I typically store it in XML outside the application and read it as needed. I think this would work well for documentation, too. You could simply have a separate stylesheet to produce documentation from it. Localizing your application would then become a matter of maintaining the alternate translations separately, either internally as separate nodes in the XML file or organizationally by maintaining a different XML file for each language.

While this approach will cause your program to take more time to start, I think the customer eventually wins because:

  1. Your program text is cleaner (and easier to maintain)
  2. You aren't forced to modify code to change text
  3. You can support many more translations easily, making your product available to more people.
tvanfosson
+1  A: 

The other answers hit on the important points so I will just point this out:

If you are doing simple one-to-one pairs, as in:

#textId  "the actual text"

Then XML is overkill. It will be slower to parse, and larger on disk. Something like a CSV or even a very simple custom format would probably be best.
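As a rough illustration of how little code such a format needs, here is a sketch of a parser for lines shaped like the example above. The name `parseTextFile` is hypothetical, and I am assuming the text sits between the first and last double quote on the line, with no escape sequences.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Parses lines of the form:  #textId  "the actual text"
// Assumptions: the key runs from '#' to the first space or tab, and the
// text is whatever lies between the first and last double quote.
std::map<std::string, std::string> parseTextFile(std::istream& in) {
    std::map<std::string, std::string> table;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] != '#') continue;   // not an entry line
        const std::size_t keyEnd = line.find_first_of(" \t");
        const std::size_t open = line.find('"');
        const std::size_t close = line.rfind('"');
        if (keyEnd == std::string::npos || open == std::string::npos ||
            close <= open || open < keyEnd) {
            continue;                                   // malformed: skip it
        }
        table[line.substr(1, keyEnd - 1)] =
            line.substr(open + 1, close - open - 1);
    }
    return table;
}
```

Lines that do not start with `#` are ignored here, which also leaves room for comments in the file.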

Adam
A: 

From personal experience with different languages, I have found a separate text file for each country to be the best solution. You have to be careful with the differences in length of the same phrase in different languages. You can also print the text file and give it to a translator well before you install the software. Each text, or part of a text, has a number as a key and can be used as required, combined with others, and so on. You can also provide special fields for inserting variable data into the text, provided you use a single subroutine to display or print the text wherever it is needed. I have installed the same software this way in 7 different languages all over the world.
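The "special fields" idea could look something like the sketch below. The `{0}`, `{1}` placeholder syntax and the name `fillFields` are assumptions of mine; the answer does not say which notation was actually used. Numbered fields have the side benefit that a translator can reorder them freely.

```cpp
#include <string>
#include <vector>

// Replaces {0}, {1}, ... in `pattern` with the matching entry of `args`.
// A placeholder with no matching argument is copied through unchanged.
std::string fillFields(const std::string& pattern,
                       const std::vector<std::string>& args) {
    std::string out;
    std::size_t pos = 0;
    while (pos < pattern.size()) {
        const std::size_t open = pattern.find('{', pos);
        const std::size_t close =
            open == std::string::npos ? open : pattern.find('}', open);
        if (close == std::string::npos) {        // no complete field left
            out.append(pattern, pos, std::string::npos);
            break;
        }
        out.append(pattern, pos, open - pos);    // literal text before the field
        const std::string inner = pattern.substr(open + 1, close - open - 1);
        bool numeric = !inner.empty() && inner.size() < 4;
        for (char c : inner) {
            if (c < '0' || c > '9') numeric = false;
        }
        const std::size_t index = numeric ? std::stoul(inner) : args.size();
        if (index < args.size()) {
            out += args[index];                  // substitute the variable data
        } else {
            out.append(pattern, open, close - open + 1);  // keep field as written
        }
        pos = close + 1;
    }
    return out;
}
```

If every piece of displayed text goes through one routine like this, the per-language files only need to agree on the field numbers, not on word order.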

A: 

I just use a simple tab-delimited text file. It can be loaded into Excel and edited very easily. It can also be in any encoding, such as ISO-8859-1, UTF-8 or UTF-16. The first column is for the ID, and each subsequent column is for a language.

I then run the text file through a pre-processor that generates a list of enums in a text.h file. The pre-processor can also generate a binary file for the text, or a cpp file that is compiled right into the binary. You can also change the encoding at this point to match what your program needs.
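For anyone wondering what the enum-generating step might look like, here is a minimal sketch. It is not the actual pre-processor; the function name `generateEnumHeader`, the `TextId` enum name, the trailing `TEXT_COUNT` entry, and the assumption that the first row is a header row are all mine.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads tab-delimited rows (ID <TAB> lang1 <TAB> lang2 ...) and returns the
// contents of a generated header with one enumerator per ID.
// Assumptions: the first row is a column header, and every ID is already a
// valid C++ identifier.
std::string generateEnumHeader(std::istream& in) {
    std::ostringstream out;
    out << "// Generated file, do not edit.\n#pragma once\nenum TextId {\n";
    std::string line;
    std::getline(in, line);                        // skip the header row
    while (std::getline(in, line)) {
        const std::string id = line.substr(0, line.find('\t'));
        if (!id.empty()) out << "    " << id << ",\n";
    }
    out << "    TEXT_COUNT\n};\n";                 // handy for sizing arrays
    return out.str();
}
```

The build step would write the returned string to text.h. Code that refers to an ID that is not in the text file then fails to compile, because the enumerator does not exist.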

The benefits are:

If a string is missing then you get a compile error.

When you add a new string you don't have to worry about forgetting to add it for all languages.

The downsides are:

You must recompile after editing your text file.

It is more difficult to merge changes made by many users, since most diff tools only work on a line-by-line basis.

KPexEA