We are looking at ways to optimize our localization testing.

Our QA group suggested a special mode that forces every string loaded from the resources to consist entirely of the letter X. We already hook LoadString and the MFC implementation of it, so adding this should not be a major hurdle.

My question is: how would you solve the formatting issues?

Examples -

CString str;
str.LoadString(IDS_MYSTRING);

where IDS_MYSTRING is "Hello World", should return "XXXXX XXXXX"
where IDS_MYSTRING is "Hello\nWorld", should return "XXXXX\nXXXXX"
where IDS_MYSTRING is "Hello%dWorld", should return "XXXXX%dXXXXX"
where IDS_MYSTRING is "Hello%.2fWorld", should return "XXXXX%.2fXXXXX"
where IDS_MYSTRING is "Hello%%World", should return "XXXXX%%XXXXX"

In summary, the string should still work when used in a printf or Format statement, and it should honor escape characters.

So this is a pure code question (C++/MFC):

CString ConvertStringToXXXX ( const CString& aSource ) 
{
   CString lResult = aSource ;

   // Insert your code here

   return lResult ;
}

I know this could be done by running tools on the .RC files, but we want to build the English version and then run it like so:

application -L10NTEST

A: 

You can apply compiler theory here and generate your scanner and parser using flex/bison (lex/yacc, or whatever tools you prefer). You can define \w+ as a word token, which matches both "Hello" and "World", and so on.
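For strings this simple, a hand-written scanner may be enough instead of one generated by flex/bison. The following is only a minimal sketch of that idea, not a definitive implementation. It assumes std::string in place of CString, a single-byte character set, and the standard printf grammar (flags, width, precision, length modifier, conversion); MSVC-specific modifiers such as I64 are not handled. The names FormatSpecLength and ConvertStringToX are made up for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the length of the printf-style conversion
// specification that starts at s[pos] (which must be '%'), or 0 if the
// characters after '%' do not form a complete specification.
std::size_t FormatSpecLength(const std::string& s, std::size_t pos)
{
    static const std::string kFlags = "-+ #0";
    static const std::string kLength = "hlLjzt";
    static const std::string kConversions = "diouxXeEfFgGaAcspn%";

    std::size_t i = pos + 1;
    // Optional flags.
    while (i < s.size() && kFlags.find(s[i]) != std::string::npos) ++i;
    // Optional width (digits or '*').
    while (i < s.size() &&
           (std::isdigit(static_cast<unsigned char>(s[i])) || s[i] == '*')) ++i;
    // Optional precision.
    if (i < s.size() && s[i] == '.') {
        ++i;
        while (i < s.size() &&
               (std::isdigit(static_cast<unsigned char>(s[i])) || s[i] == '*')) ++i;
    }
    // Optional length modifier.
    while (i < s.size() && kLength.find(s[i]) != std::string::npos) ++i;
    // Mandatory conversion character ("%%" ends here as well).
    if (i < s.size() && kConversions.find(s[i]) != std::string::npos)
        return i - pos + 1;
    return 0;
}

// Replaces every visible character with 'X' while copying whitespace,
// control characters (such as '\n') and format specifications unchanged.
std::string ConvertStringToX(const std::string& source)
{
    std::string result;
    result.reserve(source.size());

    std::size_t i = 0;
    while (i < source.size()) {
        const unsigned char ch = static_cast<unsigned char>(source[i]);
        if (ch == '%') {
            const std::size_t len = FormatSpecLength(source, i);
            if (len > 0) {
                result.append(source, i, len);  // keep the specifier as-is
                i += len;
                continue;
            }
        }
        if (std::isspace(ch) || std::iscntrl(ch))
            result += source[i];
        else
            result += 'X';
        ++i;
    }
    return result;
}
```

With this sketch, the five example strings from the question map to the outputs listed there. One caveat: a literal percent sign that is not written as %% (for instance "100% sure") can be mistaken for a specifier, so strings that never pass through printf or Format may need separate handling.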

eed3si9n
+3  A: 

If the goal of this approach is to highlight format sequences in the application (i.e. any text that appears as something other than XXXX), you could locate the escape sequences (using a regex, perhaps) and insert square brackets around them,
e.g. Some\ntext -> Some[\n]text

You keep readability (the application might be hard to use if every string is XXX) and can still detect non-resource (hardcoded) strings.

Having said that, if you're looking to detect strings that are not loaded from resources (hardcoded strings), why not just prefix the string instead of substituting Xs? You'll easily be able to tell resource-loaded strings from hardcoded strings,
e.g. Some\ntext -> [EN]Some\ntext

Hope it helps?

RobS
Actually, prefixing is a good idea; I will try it out next week. Simple solutions are always better.
titanae
A: 

I think what you need is an XXXX locale, if your software supports locales.

You develop it in English, then switch to the XXXX locale to make sure everything is translatable.

Osama ALASSIRY
+1  A: 

The pseudo-localisation feature of appTranslator can help you there: it modifies untranslated strings to use diacritics, text widening or shortening and such. That part alone may not interest you. Where it becomes interesting is that it can optionally enclose such strings in brackets. The idea was to make it more obvious that a string is pseudo-localized. You could use this to detect that a string actually comes from the string table rather than from code.

And of course, since the pseudo-localized program must run properly, appTranslator preserves all formatters (including printf-like and FormatMessage-like formatters) and special characters such as % or \n, which is what you're looking for.

You wouldn't even have to modify your code: simply create a 'dummy' translation. By 'dummy', I mean a language into which you don't plan to translate your app. Set the language preference of your app to that language. Wait, it's even better: the guys at QA can do it entirely on their own. They don't even have to bother you! :-)

Disclaimer: I'm the author of appTranslator.

Edit: answer to your comment: Glad to read you already use appTranslator. To avoid problems due to dialogs or strings not in the L10N DLL, you can simply re-build the DLLs (e.g. using a post-link step in your VS project). The process automatically re-scans the source exe and merges new and modified texts in the built resource dlls (doesn't affect the appTranslator project file, as opposed to 'Update Source'). This helps make sure your resource DLLs are always in sync with your exe.

Serge - appTranslator
Actually, we purchased your product a few years ago and I used to use it a lot personally. I think we may have to revisit it! Whilst this solution would work, with changes happening on a daily basis it still has the potential to introduce errors, such as dialogs or strings missing from the L10N DLL.
titanae
I don't see how the Catalan translation should be more dummy than other translations. Maybe you mean "one translation which is not included in the requirements", but there's quite a lot of software in Catalan so this is not a good example.
Daniel Daranas
Daniel, somehow I knew someone from Catalonia would complain some day :-( And yes, I know lots of software is translated to Catalan. But these "lots" are probably a lot smaller than the lots applicable to German, Japanese, etc... Right?
Serge - appTranslator
@Serge, yes, they are smaller. No offence, I understand it was an example of a language which you happened not to use in your app.
Daniel Daranas
A: 

My final solution was to prefix the string like so: "*[resource instance name]original string". It works really well, and it also shows which strings are likely not to fit once translated into, say, German.

Example:

Original string from appres.dll, "My Application"

New string from appres.dll, "*[appres]My Application".
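A minimal sketch of the prefixing step, assuming std::string instead of CString. The function name TagResourceString and the moduleName parameter are hypothetical; in the real application the module name would presumably come from whichever resource DLL the hooked LoadString call resolved to.

```cpp
#include <string>

// Hypothetical helper: prepends "*[module]" to a string loaded from a
// resource module. The original text is left untouched, so format
// specifiers and escape characters keep working.
std::string TagResourceString(const std::string& moduleName,
                              const std::string& original)
{
    return "*[" + moduleName + "]" + original;
}
```

Because nothing inside the original string is rewritten, this variant avoids the formatting problem from the question entirely.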

Thanks for all the suggestions.

titanae
A: 

I prefer a mechanism that we used when I was at Microsoft for pseudo-localization, which involved putting braces around each localized resource. Resource => [-Resource-], for example. Then you can always tell you have a composed string, and formatting doesn't usually change, barring line breaking rules.

We also usually did some string expansion (adding various characters around the original string), and some dictionary- or randomization-based character substitution (converting "o" to "ö", for example).
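The bracket-plus-substitution idea could be sketched roughly as follows. This is not the tooling described above, only an illustration under stated assumptions: the input and output are UTF-8 encoded std::string values, the accent map is an arbitrary four-letter sample, and the names AccentFor and PseudoLocalize are invented for this example. printf-style specifiers are copied verbatim so that something like %o is not turned into %ö.

```cpp
#include <cstddef>
#include <string>

// Illustrative accent map. The UTF-8 bytes are written as escapes so the
// encoding of the source file does not matter.
static const char* AccentFor(char ch)
{
    switch (ch) {
        case 'a': return "\xC3\xA4";  // a-umlaut
        case 'e': return "\xC3\xA9";  // e-acute
        case 'o': return "\xC3\xB6";  // o-umlaut
        case 'u': return "\xC3\xBC";  // u-umlaut
        default:  return nullptr;
    }
}

// Wraps the string in "[-" and "-]" and swaps selected vowels for accented
// ones, leaving printf-style format specifications untouched.
std::string PseudoLocalize(const std::string& source)
{
    static const std::string kConversions = "diouxXeEfFgGaAcspn%";

    std::string result = "[-";
    std::size_t i = 0;
    while (i < source.size()) {
        if (source[i] == '%') {
            // Skip flags, width, precision and length modifiers, then
            // require a conversion character to close the specification.
            const std::size_t end =
                source.find_first_not_of("-+ #0123456789.*hlLjzt", i + 1);
            if (end != std::string::npos &&
                kConversions.find(source[end]) != std::string::npos) {
                result.append(source, i, end - i + 1);
                i = end + 1;
                continue;
            }
        }
        const char* accented = AccentFor(source[i]);
        if (accented)
            result += accented;
        else
            result += source[i];
        ++i;
    }
    result += "-]";
    return result;
}
```

The accented letters also lengthen the byte count of the string, which gives a mild form of the expansion mentioned above; a fuller version would pad the text as well.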

Some teams also put the literal resource identifier (the name) as the value of the localized resource, which was more useful for localizers than for testers, because they could see where the resource was actually used in the UI.

JasonTrue