A while back I was reading the W3C article on 'Re-using Strings in Scripted Content', which contains some useful advice on internationalisation, but which strikes me as at odds with the DRY (Don't Repeat Yourself) principle of eliminating repetitive code.

To take their example, we might have some code like this...

print "The printer is ";
if (printer.working) {
    print "on.\n";
} else {
    print "off.\n";
}

print "The stapler is ";
if (stapler.working) {
    print "on.\n";
} else {
    print "off.\n";
}

My instinct would be to eliminate the repetition roughly as follows...

report_state("printer", printer);
report_state("stapler", stapler);

function report_state(name, object) {
    print "The "+name+" is ";
    if (object.working) {
        print "on.\n";
    } else {
        print "off.\n";
    }
}

...but doing so would cause a difficulty in the code if we needed to localise it to Spanish because the word for 'on' is apparently different in those two cases.

So, I guess my question is, how have other developers approached balancing the DRY principle with internationalisation of their code?

Part of me wants to argue that internationalisation is one of those extreme programming “you aren't gonna need it” situations. On the flip side, however, refactoring with the DRY principle in mind is supposed to balance this by making functionality easier to implement as it’s required, not harder as it does here.

A: 

I would suggest using a CMS rather than hardcoding in your textual values to cover localisation.

badbod99
+12  A: 

I'd try to keep complete sentences in the language resource. As you said, you might need different words in different contexts. But a bigger problem is that the order of words within a sentence can differ between languages, so building up strings from individual words can cause problems.

Just store

The printer is on
The printer is off
The stapler is on
The stapler is off

in the language resource for every language. The repetition here is less of a maintenance headache than trying to figure out where all the single words are going to pop up in your application.
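
As a minimal sketch of this idea in Python: the `MESSAGES` catalog, its key names, and the `report_state` helper are all hypothetical, not taken from any particular i18n library. The code stays DRY because it only builds a lookup key; the wording lives entirely in the resource.

```python
# Hypothetical message catalog: one complete sentence per (device, state) pair.
# A real project would load this from per-language resource files.
MESSAGES = {
    "en": {
        "printer.on": "The printer is on.",
        "printer.off": "The printer is off.",
        "stapler.on": "The stapler is on.",
        "stapler.off": "The stapler is off.",
    },
}


def report_state(device_key, working, lang="en"):
    """Return the complete, pre-translated sentence for a device's state."""
    state = "on" if working else "off"
    return MESSAGES[lang][f"{device_key}.{state}"]


print(report_state("printer", True))
print(report_state("stapler", False))
```

Adding a language then means adding another dictionary of whole sentences, with no change to `report_state`.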

Mendelt
+2  A: 

I agree with Mendelt Siebenga when he says you should keep entire sentences or phrases in your language resource files. Differences in grammar will always prevent you from doing single word replacement across languages. This will still lead to less repetitive code than your first example because you only need to check the object type and its state, then print the appropriate message from the language resource.

Bill the Lizard
+1  A: 

I suppose it depends on the level of language quality that you are aiming to achieve.

By trying to minimise repetition of code that deals with these real language strings, you are just exposing yourself to a whole other layer of logic in the syntaxes and structures of different languages. There would be a massive amount of work involved in producing code which still retains the original structure of the language whilst minimising repetition.

You'd have to decide which is the more suitable approach to a particular problem: code that repeats itself, or code that tries to be a Jack of all trades and accommodates countless rules of language (no doubt a maintenance nightmare).

Of course, you can strike a middle ground and minimise your code repetition at the cost of some grammatical eloquence. Take the example of Ultima Online: when it was localised, a string that previously read "A pile of 329 gold coins" became something like "A pile of gold coins: 329". Not great, but a fairly reasonable solution that lends itself easily to localisation.
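
A rough Python illustration of why the "label: value" form is easier to localise. Both function names are made up for this sketch, and the strings are the two quoted above. With the number inside the sentence, the noun has to agree with the count; with the number appended after a colon, the text never changes.

```python
def pile_sentence(count):
    # Number embedded in the sentence: the noun must agree with the count,
    # so this reads wrongly when count is 1.
    return f"A pile of {count} gold coins"


def pile_label(count):
    # Number kept outside the grammar: the label is a fixed string.
    return f"A pile of gold coins: {count}"


print(pile_sentence(1))
print(pile_label(1))
```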

C.McAtackney
+2  A: 

We try not to create message strings by program manipulation because the loc. team can't see them.

The loc. team actually prefer separate but nearly duplicate messages. However they will accept parameterized messages.

E.g., "The %(appliance)% is %(on_or_off)%."

The parameters can break down but at least it's more obvious to the loc team when it will work and when it won't.
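
For illustration, here is roughly what a parameterized message could look like using Python's built-in `%`-formatting with named placeholders. The `TEMPLATES` dictionary, its keys, the `render` helper, and the second template's wording are assumptions made up for this sketch. Named placeholders let a translator move the parameters around within the sentence, but they do not help when the inserted words themselves have to change form.

```python
# Hypothetical templates; a translator sees the whole sentence and can
# reorder the named placeholders freely.
TEMPLATES = {
    "status": "The %(appliance)s is %(on_or_off)s.",
    "status_reordered": "Status %(on_or_off)s: %(appliance)s.",
}


def render(template_key, **params):
    """Fill the named placeholders of a template from keyword arguments."""
    return TEMPLATES[template_key] % params


print(render("status", appliance="printer", on_or_off="on"))
print(render("status_reordered", appliance="printer", on_or_off="on"))
```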

maccullt
+3  A: 

100% agree with Mendelt.

It is not only a maintenance problem, but can also be a linguistic one. In all Latin languages the gender, number, and case of the subject affect other elements. An example in Romanian:

  The printer is on: Imprimanta este pornită // feminine
  The printer is off: Imprimanta este oprită
  The stapler is on: Perforatorul este pornit // masculine
  The stapler is off: Perforatorul este oprit
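
A small Python sketch of the problem, using only the four Romanian sentences above. The dictionary and function names are hypothetical. Composing the sentence from a noun plus a single "on"/"off" word works for the masculine noun but gives the wrong ending for the feminine one, whereas looking up the whole sentence is always right.

```python
# Complete sentences, exactly as listed above.
RO_SENTENCES = {
    ("printer", True): "Imprimanta este pornită",
    ("printer", False): "Imprimanta este oprită",
    ("stapler", True): "Perforatorul este pornit",
    ("stapler", False): "Perforatorul este oprit",
}

# Naive word-level resources: only one form of each state word is stored.
RO_NOUNS = {"printer": "Imprimanta", "stapler": "Perforatorul"}
RO_STATES = {True: "pornit", False: "oprit"}  # masculine forms only


def compose(device, working):
    """Build the sentence from separate words (ignores gender agreement)."""
    return f"{RO_NOUNS[device]} este {RO_STATES[working]}"


def lookup(device, working):
    """Fetch the complete sentence from the resource."""
    return RO_SENTENCES[(device, working)]


print(compose("stapler", True) == lookup("stapler", True))  # masculine noun
print(compose("printer", True) == lookup("printer", True))  # feminine noun
```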

Also see http://www.mihai-nita.net/article.php?artID=20060430a