We are localising the user-interface text for a web application that runs on Java 5, and have a dilemma about how we output messages that are defined in properties files - the kind used by java.util.Properties.

Some messages include a placeholder that will be filled using java.text.MessageFormat. For example:

search.summary = Your search for {0} found {1} items.

MessageFormat is annoying, because a single quote is a special character, despite being common in English text. You have to type two for a literal single quote:

warning.item = This item''s {0} is not valid.
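To make the quoting rule concrete, here is a minimal sketch of what MessageFormat does with a doubled and an undoubled apostrophe. The class name QuoteDemo and the argument value "name" are just illustrative assumptions.

```java
import java.text.MessageFormat;

// Illustrative class name; not part of the application described above.
class QuoteDemo {
    static String format(String pattern, Object... args) {
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        // Doubled quote: one literal apostrophe, and {0} is substituted.
        System.out.println(format("This item''s {0} is not valid.", "name"));
        // prints: This item's name is not valid.

        // Single quote: it opens a quoted section, so the apostrophe is
        // dropped and {0} is treated as literal text.
        System.out.println(format("This item's {0} is not valid.", "name"));
        // prints: This items {0} is not valid.
    }
}
```

The second case is the one that tends to slip past translators, because nothing fails; the output is just quietly wrong.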

However, three-quarters of the application's 1000 or so messages do not include a placeholder. This means that we can output them directly, avoiding MessageFormat, and leave the single quotes alone:

help.url = The web page's URL
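One way the "avoid MessageFormat where we can" option might look in code is a small lookup helper that only formats when arguments are supplied. This is a sketch under assumptions: Messages and render are hypothetical names, and the rule "no arguments means no MessageFormat" is a convention the team would have to agree on.

```java
import java.text.MessageFormat;

// Hypothetical helper; the name and the convention are assumptions.
class Messages {
    static String render(String pattern, Object... args) {
        // No arguments: return the text untouched, so single quotes need no escaping.
        if (args == null || args.length == 0) {
            return pattern;
        }
        // Arguments present: the pattern must follow MessageFormat quoting rules.
        return MessageFormat.format(pattern, args);
    }
}
```

With this, render("The web page's URL") returns the text unchanged, while render("This item''s {0} is not valid.", "name") still needs the doubled quote. The cost is that whoever edits the properties file has to know which messages take arguments.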

Question: should we use MessageFormat for all messages, for consistent syntax, or avoid MessageFormat where we can, so most messages do not need escaping?

There are clearly pros and cons either way.

Note that the API documentation for MessageFormat acknowledges the problem and suggests a non-solution:

The rules for using quotes within message format patterns unfortunately have shown to be somewhat confusing. In particular, it isn't always obvious to localizers whether single quotes need to be doubled or not. Make sure to inform localizers about the rules, and tell them (for example, by using comments in resource bundle source files) which strings will be processed by MessageFormat.

+1  A: 

Just write your own implementation of MessageFormat without this annoying feature. You may look at the code of SLF4J Logger.

They have their own message formatter, which can be used as follows:

logger.debug("Temperature set to {}. Old temperature was {}.", t, oldT);

Empty placeholders could be used with the default ordering, and numbered placeholders for localization cases where different languages reorder words or parts of sentences.
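A home-grown formatter along these lines could be quite small. The following is only a sketch, not SLF4J's implementation: SimpleFormat is a hypothetical class, it supports {} and {n} placeholders, does no number or date formatting, and treats single quotes as ordinary characters.

```java
// Hypothetical minimal formatter; not SLF4J code and not a MessageFormat replacement.
class SimpleFormat {
    static String format(String pattern, Object... args) {
        Object[] values = (args == null) ? new Object[0] : args;
        StringBuilder out = new StringBuilder();
        int next = 0; // argument index used by the next empty {} placeholder
        int i = 0;
        while (i < pattern.length()) {
            char c = pattern.charAt(i);
            int close = (c == '{') ? pattern.indexOf('}', i) : -1;
            if (close == -1) {
                // Ordinary character (including '), copied as-is.
                out.append(c);
                i++;
                continue;
            }
            String body = pattern.substring(i + 1, close);
            int index;
            if (body.length() == 0) {
                index = next++;
            } else {
                try {
                    index = Integer.parseInt(body);
                } catch (NumberFormatException e) {
                    index = -1; // not a placeholder
                }
            }
            if (index >= 0 && index < values.length) {
                out.append(values[index]);
            } else {
                // Unknown placeholder or missing argument: keep the text literally.
                out.append(pattern, i, close + 1);
            }
            i = close + 1;
        }
        return out.toString();
    }
}
```

For example, format("This item's {0} is not valid.", "name") gives "This item's name is not valid." with no escaping, and format("{1} before {0}", "a", "b") gives "b before a".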

Boris Pavlović
Is it possible to use that library to just format a String though?
Peter Hilton
It seems that this mechanism is buried deep in the code. Maybe you'll have more luck: http://logback.qos.ch/download.html
Boris Pavlović
Indeed, so I guess I'll look for an alternative implementation.
Peter Hilton
A: 

Another alternative: when loading the properties file, just wrap the InputStream in a FilterInputStream that doubles up every single quote.
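A rough sketch of such a stream follows. QuoteDoublingInputStream and loadEscaped are hypothetical names. It works at the byte level, which is safe for the ISO-8859-1 encoding that Properties.load(InputStream) expects, and it assumes every loaded message is later passed through MessageFormat. It would not catch an apostrophe written as a Unicode escape, and it doubles quotes in keys as well as values.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper: emits every apostrophe byte (0x27) twice.
class QuoteDoublingInputStream extends FilterInputStream {
    private boolean pendingQuote = false;

    QuoteDoublingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (pendingQuote) {
            // Emit the second copy of the quote seen on the previous call.
            pendingQuote = false;
            return '\'';
        }
        int b = super.read();
        if (b == '\'') {
            pendingQuote = true;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        int count = 0;
        // Route bulk reads through read() so the doubling is applied.
        while (count < len) {
            int b = read();
            if (b == -1) break;
            buf[off + count++] = (byte) b;
        }
        return count == 0 ? -1 : count;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would not account for a pending quote
    }

    // Demo convenience: load properties from in-memory text.
    static Properties loadEscaped(String text) {
        Properties props = new Properties();
        try {
            InputStream raw = new ByteArrayInputStream(text.getBytes("ISO-8859-1"));
            props.load(new QuoteDoublingInputStream(raw));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return props;
    }
}
```

Loading the line "warning.item = This item's {0} is not valid." this way yields the value "This item''s {0} is not valid.", which MessageFormat then renders with a single apostrophe.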

Rob Di Marco
Not all of his messages require doubled single quotes, which means he would have to hack in recognition of MessageFormat strings. Eek.
JonMR
+1  A: 

Use the ` character instead of ' for quoting. We use it all the time without problems.

Use MessageFormat only when you need it; otherwise it only bloats the code and adds no value.

dhiller
A: 

In my opinion, consistency is important for this sort of thing. Properties files and MessageFormat already have lots of limitations. If you find these troubling you could "compile" your properties files to generate properly-formed ones. But I'd say go with using MessageFormat everywhere. This way, as you maintain the code you don't need to worry about which strings are formatted and which aren't. It becomes simpler to deal with, since you can hand off message processing to a library and not worry about the details at a high level.

Mr. Shiny and New
+1  A: 

In the end we decided to side-step the single quote problem by always using ‘curly’ quotes:

warning.item = This item\u2019s {0} is not valid.
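A quick check of why this works: MessageFormat only gives special meaning to the ASCII apostrophe (U+0027), so the right single quotation mark (U+2019) passes through like any other character. The class and method names below are illustrative only.

```java
import java.text.MessageFormat;

// Illustrative class; the pattern matches the properties entry above.
class CurlyQuoteDemo {
    static String warning(String field) {
        // U+2019 is not a MessageFormat quote character, so no doubling is needed
        // and the {0} placeholder is still substituted.
        return MessageFormat.format("This item\u2019s {0} is not valid.", field);
    }
}
```

Calling warning("name") returns the message with the curly apostrophe intact and "name" in place of {0}. The same text can be output directly, without MessageFormat, and looks identical.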
Peter Hilton