Hi,

Soon I'll be working on a catalog (PHP + MySQL) that will support multilingual content, and I'm now considering the best approach to designing the database structure. At the moment I see 3 ways of handling multiple languages:

1) Having separate tables for each language's data, i.e. schematically it will look like this:

  • There will be one Main_Content_Items table storing basic data that cannot be translated, such as ID, creation_date, hits, votes and so on. There is only one such table and it serves all languages.

And here are the tables that will be duplicated for each language:

  • Common_Data_LANG table (example: common_data_en_us), storing common/"static" fields that can be translated and are present for any catalog item: title, desc and so on...
  • Extra_Fields_Data_LANG table, storing extra field data that can be translated but can differ between custom item groups, i.e. like: | id | item_id | field_type | value | ...

Then on an item request we look in the table matching the user's (or the default) language and join the translatable data with the main_content table.

Pros:

  • we can update the "main" data (i.e. hits, votes...), which is updated most often, with only one query
  • we don't need to duplicate data 4x or more if we have 4 or more languages, in comparison with a structure using only one table with a 'lang' field. So MySQL queries would have to go through a catalog of 100000 (for example) records rather than 400000 or more

Cons:

  • +2 tables for each language
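A minimal sketch of what approach 1 could look like in practice. It uses Python's built-in sqlite3 as a stand-in for PHP + MySQL so it can be run as-is; the `LANGS` whitelist, the `fetch_item` helper and the `descr` column (renamed because `desc` is an SQL keyword) are assumptions made for illustration, not part of the original design.

```python
import sqlite3

# Hypothetical whitelist: a table name cannot be bound as a query parameter,
# so the language suffix has to be validated before it is interpolated.
LANGS = ("en_us", "de_de")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_content_items ("
    "id INTEGER PRIMARY KEY, creation_date TEXT, hits INTEGER DEFAULT 0)"
)
# One translatable table per language (the "+2 tables per language" cost;
# the extra-fields table is left out to keep the sketch short).
for lang in LANGS:
    conn.execute(
        f"CREATE TABLE common_data_{lang} ("
        "item_id INTEGER PRIMARY KEY REFERENCES main_content_items(id), "
        "title TEXT, descr TEXT)"
    )

conn.execute(
    "INSERT INTO main_content_items (id, creation_date) VALUES (1, '2011-01-01')"
)
conn.execute("INSERT INTO common_data_en_us VALUES (1, 'Notebook', 'A laptop')")
conn.execute("INSERT INTO common_data_de_de VALUES (1, 'Notebook', 'Ein Laptop')")


def fetch_item(conn, item_id, lang):
    """Join the shared row with the table for the requested language."""
    if lang not in LANGS:
        raise ValueError(f"unsupported language: {lang}")
    sql = (
        "SELECT m.id, m.hits, c.title, c.descr "
        "FROM main_content_items AS m "
        f"JOIN common_data_{lang} AS c ON c.item_id = m.id "
        "WHERE m.id = ?"
    )
    return conn.execute(sql, (item_id,)).fetchone()
```

Note that the table name has to be built into the SQL string, which is why the whitelist check matters, and that adding a language means creating new tables rather than inserting rows.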

2) Using 'lang' field in content tables:

  • Main_Content_Items table (storing basic data that cannot be translated, such as ID, creation_date, hits, votes and so on...)
  • Common_Data table (storing common/"static" fields that can be translated and are present for any catalog item: | id | item_id | lang | title | desc | and so on...)
  • Extra_Fields_Data table (storing extra field data that can be translated but can differ between custom item groups, i.e. like: | id | item_id | lang | field_type | value | ...)

So we'll join common_data and extra_fields to main_content_items according to the 'lang' field.

Pros:

  • we can update the "main" data (i.e. hits, votes...), which is updated most often, with only one query
  • we need only 3 tables for content data

Cons:

  • the common_data and extra_fields tables are filled with data for all languages, so they are X times bigger (X being the number of languages) and queries run slower
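A rough sketch of approach 2, again with Python's sqlite3 standing in for PHP + MySQL. The composite primary key on (item_id, lang), the `descr` column name and the `fetch_item` helper are assumptions for illustration; the point is that the language becomes an ordinary bound parameter and the hit counter lives in exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_content_items (
    id   INTEGER PRIMARY KEY,
    hits INTEGER DEFAULT 0
);
CREATE TABLE common_data (
    item_id INTEGER NOT NULL REFERENCES main_content_items(id),
    lang    TEXT    NOT NULL,
    title   TEXT,
    descr   TEXT,                 -- 'desc' is an SQL keyword, so it is renamed
    PRIMARY KEY (item_id, lang)   -- one translation per item and language
);
INSERT INTO main_content_items (id) VALUES (1);
INSERT INTO common_data VALUES (1, 'en_us', 'Camera', 'A camera');
INSERT INTO common_data VALUES (1, 'de_de', 'Kamera', 'Eine Kamera');
""")


def fetch_item(conn, item_id, lang):
    """Same query for every language; only the bound lang value changes."""
    return conn.execute(
        "SELECT m.id, m.hits, c.title, c.descr "
        "FROM main_content_items AS m "
        "JOIN common_data AS c ON c.item_id = m.id AND c.lang = ? "
        "WHERE m.id = ?",
        (lang, item_id),
    ).fetchone()


# The counter is stored once, no matter how many translations exist.
conn.execute("UPDATE main_content_items SET hits = hits + 1 WHERE id = 1")
```

A missing translation simply returns no row here; a fallback to a default language would need an extra query or a LEFT JOIN, which this sketch leaves out.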

3) Same as the 2nd way, but with the Main_Content_Items table merged into Common_Data, which has the 'lang' field:

Pros:

  • ...?

Cons:

  • we need to update the "main" data (i.e. hits, votes...), which is updated most often, once for every language
  • the common_data and extra_fields tables are filled with data for all languages, so they are X times bigger (X being the number of languages) and queries run slower
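The first con of approach 3 can be seen in a small sketch (Python's sqlite3 as a stand-in for MySQL; the `content_items` table and its columns are hypothetical names for the merged table). With three translations of one item, a single hit has to be written to three rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content_items (      -- main data and translations merged
    item_id INTEGER NOT NULL,
    lang    TEXT    NOT NULL,
    hits    INTEGER DEFAULT 0,    -- untranslatable data repeated per language
    title   TEXT,
    PRIMARY KEY (item_id, lang)
);
INSERT INTO content_items (item_id, lang, title) VALUES (1, 'en_us', 'Camera');
INSERT INTO content_items (item_id, lang, title) VALUES (1, 'de_de', 'Kamera');
INSERT INTO content_items (item_id, lang, title) VALUES (1, 'fr_fr', 'Appareil photo');
""")

# Counting one hit means touching every language row of the item.
cur = conn.execute("UPDATE content_items SET hits = hits + 1 WHERE item_id = 1")
rows_touched = cur.rowcount
```

Forgetting the other language rows (for example by also filtering on lang in the UPDATE) would leave the counters out of sync, which is a correctness risk on top of the extra writes.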

I'd be glad to hear suggestions about which is better and why. Or are there better ways?

Thanks in advance...

A: 

I've given a similar answer in this question and highlighted the advantages of this technique (it would be, for example, important for me to let the application decide on the language and build the query accordingly by only changing the lang parameter in the WHERE clause of the SQL query).

This gets pretty close to your second solution. I didn't quite get the "extra_fields" part, but if it makes sense, you could(!) merge it into the common_data table. I would advise you against the first idea since there will be too many tables and it can be easy to lose track of the items in there.

To your edit: I still consider the second approach the better one (it's my opinion so it's relative ;)). I'm no expert on optimization, but I think that with proper indexes and a proper table structure speed should not be a problem. As always, the best way to find the most effective approach is to try both methods and see which performs better, since speed will vary with the data, structure, ....
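As a rough illustration of the indexing point, here is a sketch that fills an extra_fields_data table for several languages and asks the database how it would run a per-language lookup. It uses Python's sqlite3 and SQLite's EXPLAIN QUERY PLAN as a stand-in; in MySQL the corresponding tool is EXPLAIN, and its output looks different. The index name `idx_extra_item_lang` and the row counts are made up for the example.

```python
import itertools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE extra_fields_data ("
    "id INTEGER PRIMARY KEY, item_id INTEGER NOT NULL, lang TEXT NOT NULL, "
    "field_type TEXT, value TEXT)"
)
# Hypothetical composite index covering the columns used in the WHERE clause.
conn.execute(
    "CREATE INDEX idx_extra_item_lang ON extra_fields_data (item_id, lang)"
)

# 100 items x 4 languages x 5 extra fields = 2000 rows of dummy data.
langs = ("en_us", "de_de", "fr_fr", "ru_ru")
rows = [
    (item_id, lang, f"field_{n}", f"value {n}")
    for item_id, lang, n in itertools.product(range(1, 101), langs, range(5))
]
conn.executemany(
    "INSERT INTO extra_fields_data (item_id, lang, field_type, value) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# Ask the planner how it would execute the lookup; the last column of each
# result row holds the human-readable description of the step.
plan = " ".join(
    row[-1]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT field_type, value FROM extra_fields_data "
        "WHERE item_id = ? AND lang = ?",
        (42, "de_de"),
    )
)
uses_index = "idx_extra_item_lang" in plan
```

If the plan names the index, the lookup does not scan the rows of the other languages, so the table being X times bigger matters much less for this kind of query. Real numbers still need to be measured on the actual MySQL schema and data.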

DrColossos
Thanks for the answer, DrColossos. I made a mistake describing the 1st method, so HAVE A LOOK AT THE FIXED VARIANT. Well, the main thing (that I really like) about the first method is that we don't mix up data of different languages, especially considering that the extra_fields table can have many records linked to one item in the Main_Content_Items table, i.e. in methods 2 and 3, if we have 1000 items in the Main_Content_Items table and 4 langs, extra_fields will have 4*1000*5 (or 10 or more...) records. That's why I consider this method better at performance than the other two...
Nickolay
P.S. About "extra_fields": it's used as a universal table for any extra field type (stored as |id|item_id|field_type|field_value (simple text field)|), as there are item groups that can have specific fields that other items don't (like the parameters of a notebook versus a camera in shop scripts). So the only thing we could do is abolish the "common_data" table by moving its fields' data to "extra_fields", but I don't want to do this because in common_data each record holds the data of multiple fields, while in extra_fields each record holds only one field (see the structure), so I think it's better to have 2 separate tables for this.
Nickolay
See my (rather short) edit.
DrColossos