views:

296

answers:

4

Hi,

I have a number of database tables that contain name and description columns which need to be localized. My initial attempt at designing a DB schema that would support this was something like:

product
-------
id
name
description


local_product
-------
id
product_id
local_name
local_description
locale_id


locale
------
id
locale

However, this solution requires a new local_ table for every table that contains name and description columns that require localization. In an attempt to avoid this overhead I redesigned the schema so that only a single localization table is needed

product
-------
id
localization_id


localization    
-------
id    
local_name
local_description
locale_id


locale
------
id
locale

Here's an example of the data which would be stored in this schema when there are 2 tables (product and country) requiring localization:

country

id,     localization_id
-----------------------
1,      5

product

id,     localization_id
-----------------------
1,      2

localization

id,     local_name,   local_description,     locale_id
------------------------------------------------------
2,      apple,        a delicious fruit,     2
2,      pomme,        un fruit délicieux,    3
2,      apfel,        ein köstliches Obst,   4
5,      ireland,      a small country,       2
5,      irlande,      un petite pay,         3

locale

id,     locale
--------------
2,      en
3,      fr
4,      de

Notice that the compound primary key of the localization table is (id, locale_id), but the foreign key in the product table only refers to the first element of this compound PK. This seems like 'a bad thing' from the POV of normalization.

Is there any way I can fix this problem, or alternatively, is there a completely different schema that supports localization without creating a separate table for each localizable table?

Update: A number of respondents have proposed a solution that requires creating a separate table for each localizable table. However, this is precisely what I'm trying to avoid. The schema I've proposed above almost solves the problem to my satisfaction, but I'm unhappy about the fact that the localization_id foreign keys only refer to part of the corresponding primary key in the localization table.

Thanks, Don

A: 

The correct way, I feel, would be to create the extra table, but then go the extra step and remove all language specific resources from the first table.

So you'd have:

product

id 
-name removed
-description removed

product localization

productid, locale_id, name,   description
------------------------------------------------------
1,         3,         pomme,  un fruit délicieux
1,         4,         apfel,  ein köstliches Obst
1,         1,         apple,  a delicious fruit

locale

id,     locale
--------------
1,      en
3,      fr
4,      de
DanDan
This schema won't work when there is more than one table containing localizable content
Don
You will need an extra table per localisable content. I don't think you can get around that. Some localisable tables may not have names or descriptions, you see, or may have more fields that require translations. This of course breaks best practices for normalizations. Use the extra tables.
DanDan
In this case, it's safe to assume that *every* localizable table will only require a name and description to be localized
Don
+2  A: 

I think it's fine. You're describing a one-to-many relationship between a product and its localization text.

I'm wondering if you should also localize the english instead of denormalizing it in your product table.

Beth
I've modified the schema to incorporate your suggestion
Don
The whole point of localization is that there is a one-to-many relationship between the "thing" and the name. So it naturally follows that a reference in the "thing" table cannot be a complete primary key to the localization table, because that would constrain the relationship to one-to-one. The more common way to do a many-to-one relationship is to post the identifier of the "one" end into the "many" end, but you'd have a problem doing that here, as the localization table would then have to reference many different tables, making for a messy, loosely-defined foreign key reference.
Jay
+1  A: 

If i understand right, your issue is only because you want to use the same languale localization for name and description in more than one table. In such a scenario, you cannot add the prod_id in the localization table. One more issue in your design is that it cannot handle more than one language localization for the same product elegantly. You could tweak it to work:

If name and description are the only fields that require localization, you can do the following.

Product(ID, name, description, tanslation_row_id)

Product_translations(ID, name, description, lang_id, translation_id)

The translation_row_id will be foreign key pointing to Product_translations.ID The translation_id will, however point a parent record in the same table that would serve as a common record for all language specific records.

Example Records

Product

(ID, name, description, translation_row_id)
(p1, apples,a red fruit, tr1)
(p2, mango, a yellow fruit, tr2)

Product_translations

(ID, name, description, lang_id, translation_id)
(tr1, apples, a red fruit, ENU, null)
(tr2, mango, a yellow fruit, null)
(tr3, pomme,un fruit rouge, FRA,tr1)
(tr4, mangue,a yellow fruit, SPA,tr2)

Given a language code, you can extract the name and description values using the foll SQL query

select T.name, T.description
from product_translations T 
where T.translation_id = 
     (select T2.ID 
      from Product P,Product_translations T2 
      where P.translation_row_id = t2.ID
      ) 
     and T.lang_id = '&langID';

Important Note: I am assuming that the product table has many more attributes that do not need this translation. '&langID' is a parameter for the SQL query that would ask the user the language code of his choice

bkm
Great suggestion, I might give this a try. However, I'll probably remove name and description from the Product (and other localizable) tables, as it seems to be redundant. Also, the product_translations table should be called something more generic like 'translations', because in reality it will contain localized countries, products, etc.
Don
A: 

I like the idea, but would go a step in the other direction, and have a localization entry for every column that is translated:

country

id,     localization_id
-----------------------
1,      5

product

id,     name_locale_id,  description_locale_id
----------------------------------------------
1,      2,               8

localization

id,     locale_id,    value
------------------------------------------------------
2,      2             apple
2,      3             pomme
2,      4             apfel
5,      2             ireland
5,      3             irlande
8,      2             a delicious fruit
8,      3             un fruit délicieux
8,      4             ein köstliches Obst
9,      2             a small country
9,      3             un petite pay

locale

id,     locale
--------------
2,      en
3,      fr
4,      de

The PK of localization is (id, locale_id). It's no problem that id is also a FK reference in several other tables. You could add a surrogate PK if you want, so long as you still have a unique index on (id, locale_id).

The nice thing about this is it's a single localization table, and it works for any table in your schema, regardless of what fields it has (you're not limited to having both name and description of anything that gets localized). The downside is a potential performance hit when using the localization table -- though potentially you could just cache the whole thing for a given locale_id, so when you're looking up entries you just have to look for the given id (since your cache is keyed based on the language already).

You could also consider leaving in default name and description fields in the product table, which would get used in case an entry is missing for the current language, or when entering, the user didn't specify the language. This would also be the case if you're porting an existing app, you'd already have values there (without locale information).

gregmac