views:

108

answers:

2

I have a Java application with server and Swing client. Now I need to localize the user interface and possibly also some of the data needs to be locale specific. There are few things in specific I would like to hear your opinions on.

  1. How should I distribute the localized strings for the UI into properties files? In my application there are several views and each has several panels. Should I have one localization file per language for each panel or view or should I keep all translations for one language in the same file? I'm currently leaning towards one file per view and language, but I'm not sure how I should handle some domain specific terms which appear in many places. Having the same translation on several files does not sound too good.
  2. The server throws some exceptions that contain a message that should be displayed to the user. I could get the selected locale from the session and handle the localization at the server, but I feel it would be more elegant to keep all localization files at the client. I have been thinking about sending only a localization key from the server with some kind of placeholders for error specific information, which would be sent with the exception. Then the client could construct the message based on the localization key and replace the placeholders with the error specific information. Does that sound like a good way to handle it, or are there other options? Typically my exception messages contain some additional information that changes for each case. It could be for example "A user with username Khilon already exists", in which case the string in the properties file would be something like "A user with username {0} already exists".
  3. The localization of the data is the area that is the most unclear to me. As I'm not sure if it will be ever required, I have so far not planned it very much. The database part sounds straightforward enough, you basically just need an additional table for the strings and a column to tell for which locale the string is. Though I'm not sure if it would be best to have a localization table for each data table (eg Product and Product_names), or could I use one table for localization strings for all the data tables. The really tricky part is how to handle the UI, as to some degree it would be required for an user to enter text for an object in multiple languages. In practice this could mean for example, that a worker in Finland would give the object a name in Finnish and English, and then a worker in another country could translate it to her own language. If any of you has done something similar, I'd be happy to hear how you did it.

I'm very grateful to everybody who can share their experiences with me.

P.S. If you happen to know any exceptionally good websites or books on the subject, I would be happy to hear of them. I have of course done some googling and read some articles about localization, but nothing mind blowing yet.

+1  A: 

Actually, what you are talking about is Internationalization (i18n), not Localization (L10n). From my experience, you are on the right path.

ad 1). One properties file per view and locale (not necessary language, as you may want to use different translations for certain languages depending on country, i.e. using different strings for British an American English thus different locales) is the right approach. Since applications tend to evolve, it could save a good deal of money when you want to modify just one view (as translators will charge you even for something they won't touch - they will have to actually find strings that need to be updated/newly translated). It would be also easier to use with Translation Memory tools if you do it right (new strings at the end of the file all the time).

ad 2). The best idea is to sent out only the resource key from server or other process; other approach could be attaching a resource key and possibly the data (i.e. numeric value) using delimiters, so the message could be recreated and reformatted into local language.

ad 3). I have seen several approaches to localizing Databases, but the best (and it is not only my opinion, but also IEEE members) is to store resource keys and recreate the data on client side using appropriate locale. Of course this goes for pre-installed data, if you let users to enter the data, other issues will arose... There is no silver bullet, one need to think what works best in his/her context. I would lean to including a foreign key column that will identify the language, but it really depends on kind of data that will be stored.

Unfortunately i18n doesn't end here, please remember about correctly formatting dates and numbers so that they will be understandable for people using your program. And also, if you happen to have some list of strings, the sorting order should also depend on locale (it's called collation). Sun used to have (now our beloved Oracle) has quite good i18n trail which you can find here: http://download.oracle.com/javase/tutorial/i18n/index.html .

If you want to read good book on the subject of i18n and L10n, that will save you years of learning these topics (although not necessary will teach you how to program it), there is great book from Microsoft Press: "Developing International Software" - http://www.amazon.com/Developing-International-Software-Dr/dp/0735615837 . It still relevant, although quite old.

Paweł Dyda
Thank you for your excellent answer. I have worked on another project where db contained only keys and the locale strings were at the client properties file. However, this was very problematic when you needed to make any changes to the data, as it required a new release of the client. And in my application users can change everything, so that's not an alternative. As far as the other problems go, dates I've got covered and I'm working on numbers. But I know I'll have some trouble with character sets...
Carlos
+1  A: 

1) I usually keep everything in one file and use names that signify where the properties are used. For example, I prefix with things like "view" and "menu"

  • view.add_request.title
  • view.add_request.contact_information.sectionheader
  • view.add_request.contact_information.first_name.label
  • view.add_request.contact_information.last_name.label
  • menu.admin.user_management.add_user.label
  • menu.admin.user_management.add_role.label

2) Yes, passing around the key makes things simpler and makes the server code easier to test. It also avoids having to pass locale information to the server to have it decide on a language for the client. Its a thick client, so let it handle the localization.

3) I haven't localized data before (usually just labels, and static UI verbage), but I would probably lean towards having a single table with all the localized strings and locales to start with (just to keep it simple). I'm not sure what you're asking about in reference to the UI, but I would suggest you make sure that whatever character-set you're using allows all the languages you want to support. Make sure you read Joel Spolsky's article entitled: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Javid Jamae
Thank you for your answer. How long are your properties files typically? I'm worried that keeping everything in one file would make it a bit hard to handle. The UI part meant that the users will be the ones entering the data to the db, and somehow the UI would have to support entering text for other locales than the one the client is actually using. It is good that you mentioned character sets. I already changed my db to use Unicode and found out that it still leaves problems with some languages, so I have to work on that. I thought everything would be good as long everything is UTF-8...
Carlos