I need to translate a Microsoft locale ID, such as 1033 (for US English), into either an ISO 639 language code or directly into a Java Locale instance. (Edit: or even simply into the "Language - Country/Region" in Microsoft's table.)

Is this possible, and what's the easiest way? Preferably using only JDK standard libraries, of course, but if that's not possible, with a 3rd party library.

A: 

The first hit on Google for "Java LCID" is this javadoc:

gnu.java.awt.font.opentype.NameDecoder

private static java.util.Locale getWindowsLocale(int lcid)

Maps a Windows LCID into a Java Locale.

Parameters:
    lcid - the Windows language ID whose Java locale is to be retrieved. 
Returns:
    an suitable Locale, or null if the mapping cannot be performed.

I'm not sure where to download this library, but it's GNU, so it shouldn't be too hard to find.

skaffman
We found that too, but it looks like a rather suboptimal solution for this - it's a "utility class that helps with decoding the names of *OpenType and TrueType fonts*". And looking at the source, the conversion methods seem to be quite lacking - it only knows how to map a couple of the most common locales!
Jonik
Here's the source: http://classpath.sourcearchive.com/documentation/0.91/NameDecoder_8java-source.html See the method which "Maps a Windows LCID into a Java Locale", and note the comment "FIXME: This is grossly incomplete." :P
Jonik
Gah, not good... GNU in "crappy java implementation" shocka.
skaffman
A: 

As it started to look like there is no ready Java solution to do this mapping, we took the ~20 minutes to roll something of our own, at least for now.

We took the information from the horse's mouth, i.e. http://msdn.microsoft.com/en-us/goglobal/bb964664.aspx, and copy-pasted it (through Excel) into a .properties file like this:

1078 = Afrikaans - South Africa
1052 = Albanian - Albania
1118 = Amharic - Ethiopia
1025 = Arabic - Saudi Arabia
5121 = Arabic - Algeria 
...

(You can download the file here if you have similar needs.)

Then there's a very simple class that reads the information from the .properties file into a map, and has a method for doing the conversion.

Map<String, String> lcidToDescription;

public String getDescription(String lcid) { ... }
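
A minimal sketch of what such a class might look like, assuming the file uses the "lcid = description" layout shown above. The class name LcidDescriptions and the inline sample data are hypothetical, not the code we actually used; a real version would read the .properties file from the classpath or disk instead of a string.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper: looks up Microsoft's "Language - Country/Region" text by LCID. */
final class LcidDescriptions {
    private final Map<String, String> lcidToDescription = new HashMap<>();

    /** Reads "lcid = description" pairs in .properties syntax from the given reader. */
    LcidDescriptions(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String lcid : props.stringPropertyNames()) {
            // Properties keeps trailing whitespace in values, so trim it here.
            lcidToDescription.put(lcid, props.getProperty(lcid).trim());
        }
    }

    /** Returns the description for the LCID, or null if the LCID is unknown. */
    String getDescription(String lcid) {
        return lcidToDescription.get(lcid.trim());
    }

    public static void main(String[] args) {
        // Inline sample standing in for the real .properties file.
        String sample = "1078 = Afrikaans - South Africa\n"
                      + "1025 = Arabic - Saudi Arabia\n"
                      + "5121 = Arabic - Algeria \n";
        LcidDescriptions table = new LcidDescriptions(new StringReader(sample));
        System.out.println(table.getDescription("5121")); // Arabic - Algeria
        System.out.println(table.getDescription("9999")); // null
    }
}
```

Returning null for an unknown LCID is a design choice in this sketch; falling back to the raw numeric code would work just as well for display purposes.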

And yes, this doesn't actually map to language code or Locale object (which is what I originally asked), but to Microsoft's "Language - Country/Region" description. It turned out this was sufficient for our current need.
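
If a Locale is needed after all, the same table idea should carry over: store a language tag per LCID instead of a description and hand it to Locale.forLanguageTag (JDK 7+). The sketch below is only an illustration under that assumption. The class name LcidLocales is hypothetical, and the handful of LCID-to-tag pairs are my own reading of Microsoft's list, not part of the file described above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: maps a few Windows LCIDs to java.util.Locale via language tags. */
final class LcidLocales {
    private static final Map<Integer, String> LCID_TO_TAG = new HashMap<>();
    static {
        // Illustrative subset only (assumed pairs); a real table would cover the full list.
        LCID_TO_TAG.put(1033, "en-US");
        LCID_TO_TAG.put(1078, "af-ZA");
        LCID_TO_TAG.put(1052, "sq-AL");
        LCID_TO_TAG.put(1025, "ar-SA");
        LCID_TO_TAG.put(5121, "ar-DZ");
    }

    /** Returns the Locale for the LCID, or null if this table has no entry for it. */
    static Locale toLocale(int lcid) {
        String tag = LCID_TO_TAG.get(lcid);
        return tag == null ? null : Locale.forLanguageTag(tag);
    }

    public static void main(String[] args) {
        Locale us = toLocale(1033);
        System.out.println(us.getLanguage() + " / " + us.getCountry()); // en / US
        System.out.println(toLocale(9999));                             // null
    }
}
```

The ISO 639 language code asked about in the question is then available through getLanguage() on the returned Locale.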

Disclaimer: this really is a minimalistic, "dummy" way of doing it yourself in Java, and obviously keeping (and maintaining) a copy of the LCID mapping information in your own codebase is not very elegant. (On the other hand, neither would I want to include a huge library jar or do anything overly complicated just for this simple mapping.) So despite this answer, feel free to post more elegant solutions or existing libraries if you know of anything like that.

Jonik
Mapping using English-only names may cause problems. See this answer: http://stackoverflow.com/questions/958178/in-java-is-there-any-way-to-get-a-locale-given-its-display-name/958600#958600
McDowell
In general that's true - but for our particular case those language/country names in English are fine (we're pulling some software information out of an SCCM database, and simply want something more human-readable than the numeric codes)
Jonik
A: 

You could use GetLocaleInfo to do this (assuming you are running on Windows 2000 or later).

This C++ code demonstrates how to use the function:

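#include <windows.h>

int main()
{
  // Named hStdout because "stdout" is a macro in <stdio.h>.
  HANDLE hStdout = GetStdHandle(STD_OUTPUT_HANDLE);
  if (INVALID_HANDLE_VALUE == hStdout) return 1;

  LCID locale = 0x0c01; // Arabic - Egypt
  // The first call returns the required buffer size, including the terminating null.
  int nchars = GetLocaleInfoW(locale, LOCALE_SISO639LANGNAME, NULL, 0);
  if (nchars == 0) return 1; // unknown LCID or other failure
  wchar_t* languageCode = new wchar_t[nchars];
  GetLocaleInfoW(locale, LOCALE_SISO639LANGNAME, languageCode, nchars);

  // Write everything except the terminating null.
  WriteConsoleW(hStdout, languageCode, nchars - 1, NULL, NULL);
  delete[] languageCode;
  return 0;
}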

It would not take much work to turn this into a JNA call. (Tip: emit constants as ints to find their values.)

Sample JNA code:

Using JNI is a bit more involved, but is manageable for a relatively trivial task.

At the very least, I would look into using native calls to build your conversion database. I'm not sure if Windows has a way to enumerate the LCIDs, but there's bound to be something in .NET. As a build-level step, this isn't a huge burden. I would want to avoid manual maintenance of the list.

McDowell
Thanks! In our case the code needs to run on other platforms (e.g. Linux) too, even though the information we're handling is Windows-centric and comes from an SCCM database. But maybe in some cases this is the best option - I do agree that it's not nice to have to maintain the mappings in a file (even if they rarely change). Btw, if anyone considers doing this using the Windows API, this might be of help: http://stackoverflow.com/questions/1000723/what-is-the-easiest-way-to-call-a-windows-kernel-function-from-java
Jonik