Let's say I have the following Django model:

from django.db import models

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    label = models.CharField(max_length=255)
    abbreviation = models.CharField(max_length=255)

Each label has an ID number, the label text, and an abbreviation. Now, I want to have these labels translatable into other languages. What is the best way to do this?

As I see it, I have a few options:

1: Add the translations as fields on the model:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    label_english = models.CharField(max_length=255)
    abbreviation_english = models.CharField(max_length=255)
    label_spanish = models.CharField(max_length=255)
    abbreviation_spanish = models.CharField(max_length=255)

This is obviously not ideal: adding a language requires editing the model, and the correct field name depends on the language.

2: Add the language as a foreign key:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    label = models.CharField(max_length=255)
    abbreviation = models.CharField(max_length=255)
    language = models.ForeignKey('languages.Language', on_delete=models.CASCADE)

This is much better: now I can ask for all labels in a certain language and throw them into a dict:

labels = StandardLabel.objects.filter(language=1)
labels = dict((x.pk, x) for x in labels)

But the problem here is that the labels dict is meant to be a lookup table, like so:

x = OtherObjectWithAReferenceToTheseLabels.objects.get(pk=3)
thelabel = labels[x.labelIdNumber].label

That doesn't work when there is one row per label per language, because the primary key then identifies a single translation rather than the label itself. To solve that, I need another field:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    group_id = models.IntegerField(db_index=True)
    label = models.CharField(max_length=255)
    abbreviation = models.CharField(max_length=255)
    language = models.ForeignKey('languages.Language', on_delete=models.CASCADE)

    class Meta:
        unique_together = (("group_id", "language"),)

# ...and I need to group them differently:
labels = StandardLabel.objects.filter(language=1)
labels = dict((x.group_id, x) for x in labels)

3: Throw label text out into a new model:

class StandardLabel(models.Model):
    id = models.AutoField(primary_key=True)
    text = models.ManyToManyField('LabelText')

class LabelText(models.Model):
    id = models.AutoField(primary_key=True)
    label = models.CharField(max_length=255)
    abbreviation = models.CharField(max_length=255)
    language = models.ForeignKey('languages.Language', on_delete=models.CASCADE)

labels = StandardLabel.objects.filter(text__language=1)
labels = dict((x.pk, x) for x in labels)

But this doesn't work well, because it causes a database hit every time I reference the label's text:

x = OtherObjectWithAReferenceToTheseLabels.objects.get(pk=3)
thelabel = labels[x.labelIdNumber].text.get(language=1)

I've implemented option 2, but I find it very ugly: I don't like the group_id field, and I can't think of anything better to name it. In addition, StandardLabel as I'm using it is an abstract model, which I subclass to get different label sets for different fields.

I suppose that if option 3 /didn't/ hit the database, it's what I'd choose. I believe the real problem is that the filter text__language=1 doesn't cache the LabelText instances, so the DB is hit when I call text.get(language=1).
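
One way to avoid the per-label query is to fetch every text for a single language in one query and key the result by label id. The sketch below uses the standard library's sqlite3 module instead of the Django ORM so that it runs on its own; the table names, column names, sample rows, and the load_labels helper are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical tables standing in for StandardLabel and LabelText.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE label (id INTEGER PRIMARY KEY);
    CREATE TABLE label_text (
        label_id INTEGER REFERENCES label(id),
        language TEXT,
        label TEXT,
        abbreviation TEXT,
        UNIQUE (label_id, language)
    );
    INSERT INTO label (id) VALUES (1), (2);
    INSERT INTO label_text VALUES
        (1, 'en', 'Suite', 'ste'),
        (1, 'es', 'Suite', 'ste'),
        (2, 'en', 'Street', 'st'),
        (2, 'es', 'Calle', 'c');
""")


def load_labels(conn, language):
    """Return {label_id: (label, abbreviation)} using a single query."""
    rows = conn.execute(
        "SELECT label_id, label, abbreviation FROM label_text WHERE language = ?",
        (language,),
    )
    return {label_id: (label, abbr) for label_id, label, abbr in rows}


labels = load_labels(conn, "es")
# Later lookups are plain dict access and never touch the database.
thelabel = labels[2][0]
```

In Django itself, prefetch_related on the many-to-many field is probably the closest equivalent, though that is not shown here.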

What are your thoughts on this? Can anyone recommend a cleaner solution?

Edit: Just to make it clear, these are not form labels, so the Django Internationalization system doesn't help.

+2  A: 

I would keep things as simple as possible. The lookup will be faster and the code cleaner with something like this:

class StandardLabel(models.Model):
    abbreviation = models.CharField(max_length=255)
    label = models.CharField(max_length=255)
    language = models.CharField(max_length=2)
    # or, alternately, specify language as a foreign key:
    # language = models.ForeignKey(Language, on_delete=models.CASCADE)

    class Meta:
        unique_together = ('language', 'abbreviation')

Then query based on abbreviation and language:

l = StandardLabel.objects.get(language='en', abbreviation='suite')
Daniel
I would rather load all of the labels into memory up front so that I only hit the DB once. I have several objects with ID numbers that will be referencing these labels. I'd use a foreign key to labels if I could, but that locks in a language unless I go with option 3, which still hits the DB.
The_OP
I think you're overthinking things. If you need to cache data, then just do it: `StandardLabel.objects.all().values()`. If you're concerned about additional database queries, then use `select_related()`.
Daniel
+1. Simple should be the way to go. Translation is already a complicated task.
muhuk
Good suggestion on that cache, but I was under the impression that getting .all() and then later filtering (.get(pk=4)) wouldn't reuse the cache for all types of filtering.
The_OP
`.values()` returns a queryset of dicts, one per row. You would then use normal Python dict notation, i.e. `row['mykey']`, to access the values.
Daniel
I meant the cache when things eventually do have ForeignKeys: Other.the_label_fk_field.abbreviation - is that one populated by evaluating the query?
The_OP
Never mind, select_related would do that for me. The reason I thought it wouldn't is that I had implemented labels as option 2, a ManyToManyField.
The_OP
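
The caching idea from the comments above can be sketched in plain Python. The list of dicts below stands in for what `list(StandardLabel.objects.values())` might return; the rows and the build_cache helper are made up for illustration, not taken from the answer.

```python
from collections import defaultdict

# Stand-in for list(StandardLabel.objects.values()); the rows are made up.
rows = [
    {"id": 1, "language": "en", "abbreviation": "suite", "label": "Suite"},
    {"id": 2, "language": "es", "abbreviation": "suite", "label": "Suite"},
    {"id": 3, "language": "en", "abbreviation": "st", "label": "Street"},
    {"id": 4, "language": "es", "abbreviation": "st", "label": "Calle"},
]


def build_cache(rows):
    """Group rows as cache[language][abbreviation] -> row."""
    cache = defaultdict(dict)
    for row in rows:
        cache[row["language"]][row["abbreviation"]] = row
    return cache


cache = build_cache(rows)
label = cache["es"]["st"]["label"]
```

Because (language, abbreviation) is unique in the model, each inner key maps to exactly one row, and every lookup after the initial load is a dict access.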
+2  A: 

Another option you might consider, depending on your application design of course, is to make use of Django's internationalization features. The approach they use is quite similar to the approach found in desktop software.

I see the question was edited to add a reference to Django internationalization, so you do know about it, but the internationalization features in Django apply to much more than just forms; they touch quite a lot, and need only a few tweaks to your app design.

Their docs are here: http://docs.djangoproject.com/en/dev/topics/i18n/#topics-i18n

The idea is that you define your model as if there was only one language. In other words, make no reference to language at all, and put only, say, English in the model.

So:

class StandardLabel(models.Model):
    abbreviation = models.CharField(max_length=255)
    label = models.CharField(max_length=255)

I know this looks like you've totally thrown out the language issue, but you've actually just relocated it. Instead of the language being in your data model, you've pushed it to the view.

The Django internationalization features allow you to generate text translation files, and provide a number of tools for pulling text out of the system into files. This is actually quite useful because it allows you to send plain files to your translator, which makes their job easier. Adding a new language is as easy as getting the file translated into that language.

The translation files pair each label from the database with its translation for a given language. There are functions for handling the translation dynamically at run time for models, admin views, JavaScript, and templates.

For example, in a template, you might do something like:

<b>Hello {% trans "Here's the string in english" %}</b>

Or in view code, you could do:

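# ugettext was removed in Django 4.0; import gettext instead on newer versions
from django.utils.translation import ugettext

# See docs on setting language, or getting Django to auto-set language
s = StandardLabel.objects.get(id=1)
lang_specific_label = ugettext(s.label)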
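
The lookup-with-fallback behaviour that this approach relies on can be imitated with the standard library's gettext module. This is only a toy: a real project would load compiled catalogs through Django, and the DictTranslations class and the sample catalog are assumptions made for illustration.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog: translations come from a dict instead of a .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        if message in self._messages:
            return self._messages[message]
        # NullTranslations returns the message unchanged when no fallback is set.
        return super().gettext(message)


# Hypothetical Spanish catalog keyed by the English text stored in the database.
spanish = DictTranslations({"Street": "Calle"})

translated = spanish.gettext("Street")
untranslated = spanish.gettext("Suite")
```

The key point is that the database holds only the source-language string, and a string with no catalog entry is returned as is.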

Of course, if your app is all about entering new languages on the fly, then this approach may not work for you. Still, have a look at the Internationalization project, as you may either be able to use it "as is" or be inspired to find a Django-appropriate solution that does work for your domain.

Jarret Hardie
I have looked at this, but unfortunately it's not exactly what I want to do. I would really rather the translations live in the database, for various reasons. Thanks for your help anyway!
The_OP
-1 On any reasonably dynamic website, you can't use gettext for translating database content; it changes far too often.
Carl Meyer
A: 

Although I would go with Daniel's solution, here is an alternative from what I've understood from your comments:

You can use an XMLField or JSONField to store your language/translation pairs. This would allow your objects referencing your labels to use a single id for all translations. And then you can have a custom manager method to call a specific translation:

Label.objects.get_by_language('ru', **kwargs)
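
A minimal plain-Python sketch of the storage idea, assuming the translations are kept as a JSON object keyed by language code. The Label class, the text_for method, and the fallback to English are hypothetical; the get_by_language manager method itself is not implemented here.

```python
import json


class Label:
    """Plain-Python stand-in for a model with a JSON text column."""

    def __init__(self, pk, translations_json):
        self.pk = pk
        self.translations_json = translations_json

    def text_for(self, language, default="en"):
        translations = json.loads(self.translations_json)
        # Fall back to the default language when a translation is missing.
        return translations.get(language, translations.get(default))


label = Label(1, json.dumps({"en": "Street", "ru": "Улица"}))
russian = label.text_for("ru")
fallback = label.text_for("fr")
```

Other objects only ever store the label's single id, and the language is chosen at read time.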

Or a slightly cleaner and slightly more complicated solution that plays well with the admin would be to move the translations out of the XMLField into another model with a many-to-one relationship to the Label model. Same API, but instead of parsing XML it would query the related model.

With both suggestions there's a single object that users of a label point to.

I wouldn't worry about the queries too much. Django caches the results of a queryset once it has been evaluated, and your DBMS probably has superior caching of its own as well.

muhuk
This is a decent solution, but I'd like one integrated into the admin UI (I know, I'm spoiled :) ).
The_OP
+1  A: 

I'd much prefer to add a field per language than a new model instance per language. It does require schema alteration when you add a new language, but that isn't hard, and how often do you expect to add languages? In the meantime, it'll give you better database performance (no added joins or indexes) and you don't have to muck up your query logic with translation stuff; keep it all in the templates where it belongs.

Even better, use a reusable app like django-transmeta or django-modeltranslation that makes this stupid simple and almost completely transparent.
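
The general idea of one field per language behind a single attribute can be sketched in plain Python. This is not how either app is implemented; the TranslatedLabel class and its active_language switch are assumptions used only to show the shape of the approach.

```python
class TranslatedLabel:
    """One attribute per language, read through a single `label` property."""

    active_language = "en"  # assumed to be set per request in a real app

    def __init__(self, label_en, label_es):
        self.label_en = label_en
        self.label_es = label_es

    @property
    def label(self):
        # Resolve e.g. `label_es` from the active language code.
        return getattr(self, "label_" + self.active_language)


street = TranslatedLabel(label_en="Street", label_es="Calle")
english = street.label
TranslatedLabel.active_language = "es"
spanish = street.label
```

Calling code reads `label` and never mentions a language, which is what keeps the query logic free of translation concerns.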

Carl Meyer
Thank you very much! These two apps look very promising - now I just need to decide which I want to use :)
The_OP
Yeah, that's a tough call. Personally I prefer some things about django-transmeta (like how it gets rid of the original field so you don't need complex logic about what's in it), but I love that django-modeltranslation allows me to translate models without touching their code.
Carl Meyer