Hi all, I'm doing some i18n on a web-based app using Django, which uses gettext as its i18n foundation. It seems like an obvious idea that translations should be stored in the database, and not difficult to do, but po files on the filesystem are still being used. Why is this?

My current suspicion is that the benefits of developing a db backend are simply outweighed by the reliability and familiarity of gettext as a well-established package. Are there other significant reasons for continuing to store the translations on the filesystem?

A: 

This may seem like an obvious idea to you, but I don't think everybody will agree. AFAIK Django uses .po files for the following reasons:

  • Version control - different people work on translations and their changes still have to be tracked and merged. You can't get away from having .po files for that purpose, so you would have to create additional ".po to database" tools.
  • gettext is a standard way of doing translations in the *nix world; there are many tools for working with it, and .po files are simple to edit, diff, etc.
  • No need to hit the database when you need to translate something. Some views can work without any db requests, so there is no need to tie them to the database just to get a translation. (I may be wrong, but in the case of mod_wsgi, translations will be loaded once and stored in memory for every thread.)
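To illustrate the "simple to diff" point: because a .po file is plain text, a changed translation shows up as an ordinary line diff, which is what version control tools display. A minimal sketch using Python's standard `difflib`; the catalog contents and the `django.po` file name are made up for illustration.

```python
import difflib

# Two revisions of a hypothetical Spanish catalog (contents invented for illustration).
OLD_PO = '''\
msgid "Save"
msgstr "Guardar"

msgid "Cancel"
msgstr "Cancelar"
'''

NEW_PO = '''\
msgid "Save"
msgstr "Guardar"

msgid "Cancel"
msgstr "Anular"
'''


def po_diff(old, new, name="django.po"):
    """Return the lines of a unified diff between two revisions of a .po file."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Only the changed msgstr line appears with -/+ markers.
    print("\n".join(po_diff(OLD_PO, NEW_PO)))
```

The same change stored as a database row would need a custom audit or export step before a reviewer could see it this way.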

Btw, if you need different translations for fields, that's a slightly different question; check http://www.muhuk.com/2010/01/dynamic-translation-apps-for-django/ and choose the app that best fits your needs.

Riz
A: 

It's a very common way to do translations and has been around for a long time, which has allowed any issues to be ironed out over the years. I imagine that when writing something like gettext it would be all too easy to make incorrect generalisations about how languages work. Why should the Django development team spend time researching and developing that when it has already been done in a tried and tested system? Furthermore, professional translators probably know what to do with PO files, whereas a home-brew translation database may prevent them from working in the ways they're used to.

Why would you prefer translations in a database? I guess you might prefer it because you could build a translation interface on top of the database. If that's the case, have a look at Pootle: it's a powerful web-based translation interface that works directly with PO files and can even integrate with common version control systems. Add some post-commit hooks and you can have such a system with little work and without the overhead of a translations database.

Hope that helps.

StephenPaulger
+1  A: 

Performance is the main reason. Gettext does not use a database because a database will always be considerably slower than a file. The load time of the dictionary is very important, and for this reason almost everyone uses files for it.

Also, the compiled gettext files (.mo) are optimized for loading into memory, which makes them more appropriate than plain text files (such as uncompiled .po files).

You can always use a translation platform, which will probably use a database backend, to do the translation and then export the results to text files. Examples: Pootle, Narro, Launchpad Rosetta, Transifex (hosted only).

Do not confuse your application's language files with the localization database. Your application should use file-based dictionaries that are fast to load, while your localization system will probably have to use a database and, logically, be able to export its data to files.
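The split described above can be sketched as follows: translations live in a database on the localization side and are exported to a .po file for the application. The `translation` table, its columns, and the helper names are assumptions made for this example, not the schema of any of the platforms mentioned.

```python
import sqlite3


def po_quote(text):
    """Escape a string and wrap it in double quotes for a .po file."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\t", "\\t")
    )
    return f'"{escaped}"'


def export_po(conn, language):
    """Render all translations stored for one language as .po text."""
    rows = conn.execute(
        "SELECT msgid, msgstr FROM translation WHERE language = ? ORDER BY msgid",
        (language,),
    )
    # The header entry (empty msgid) declares the charset for downstream tools.
    lines = [
        'msgid ""',
        'msgstr ""',
        po_quote(f"Language: {language}\n"),
        po_quote("Content-Type: text/plain; charset=UTF-8\n"),
        "",
    ]
    for msgid, msgstr in rows:
        lines += [f"msgid {po_quote(msgid)}", f"msgstr {po_quote(msgstr)}", ""]
    return "\n".join(lines)


def demo_connection():
    """Build an in-memory database with a hypothetical translation table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE translation (language TEXT, msgid TEXT, msgstr TEXT)")
    conn.executemany(
        "INSERT INTO translation VALUES (?, ?, ?)",
        [
            ("es", "Save", "Guardar"),
            ("es", 'Say "hi"', 'Di "hola"'),
            ("fr", "Save", "Enregistrer"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(export_po(demo_connection(), "es"))
```

The exported text would then be compiled to a .mo file (for example with `msgfmt`) and shipped with the application, so the running app never queries the localization database.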

By the way, using gettext is probably the best technological decision you can make regarding localization. I have never seen any commercial or in-house solution able to compete with it on features, tools and even support.

Sorin Sbarnea
"a database will always be considerably slower than a file" [citation-needed] - esp. for in-memory tables and an IO-busy disk, this tends to be false.
Piskvor
You forget to consider that this file can be loaded into memory too. And a `.mo` file is even better because it already contains the hash table, so you do not even have to hash the strings when you load it.
Sorin Sbarnea