Most people would agree that internationalizing an existing app is more expensive than developing an internationalized app from scratch.

Is that really true? Or, when you write an internationalized app from scratch, is the cost of doing I18N just amortized over many small assignments, so that nobody feels the whole weight of the internationalization task on their shoulders?

You could even argue that a mature app has many lines of code that were deleted over the project's history. Those lines never need to be I18Ned if internationalization is done as an afterthought, but they would have been if the project had been internationalized from the very beginning.

So do you think a project starting today must be internationalized, or can that decision be deferred to the future, based on the success (or not) the software enjoys and the geographic distribution of the demand?

I am not talking about the ability to manipulate Unicode data. That you have for free in most mainstream languages, databases and libraries. I am talking specifically about supporting your own software's user interface in multiple languages and locales.

+13  A: 

"when you write an internationalized app from scratch the cost of doing I18N is ... amortized"

However, that's not the whole story.

Retroactively tracking down every message to the users is -- in some cases -- impossible.

Not hard. Impossible.

Consider this.

theMessage = "Some initial part" + some_function() + "some following part";

You're going to have a terrible time finding all of these kinds of situations. After all, some_function just returns a String. You don't know if it's a database key (never shown to a person) or a message which must be translated. And when it's translated, grammar rules may reveal that a 3-part string concatenation was a dumb idea.
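A minimal sketch of the alternative, in Python, assuming messages live in a per-language catalog keyed by message ID. The `CATALOG` dict, the `translate` helper and the `file.saved` key are hypothetical names made up for illustration, not part of any particular library. Each message is one complete sentence with named placeholders, so a translator can move the placeholders wherever the target grammar needs them:

```python
# Hypothetical message catalog: one complete sentence per key, per language.
CATALOG = {
    "en": {"file.saved": "Saved {count} files to {folder}."},
    "de": {"file.saved": "In {folder} wurden {count} Dateien gespeichert."},
}


def translate(lang, key, **params):
    """Look up a whole-sentence template and fill in its named placeholders."""
    # Unknown languages and missing keys both fall back to English.
    template = CATALOG.get(lang, CATALOG["en"]).get(key)
    if template is None:
        template = CATALOG["en"][key]
    return template.format(**params)


print(translate("en", "file.saved", count=3, folder="Reports"))
print(translate("de", "file.saved", count=3, folder="Reports"))
```

Note that the German sentence puts the folder before the count, which a three-part concatenation could not express. Plural forms are not handled here; a real catalog would need them too.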

You can't simply GREP every String-valued function as containing a possible I18N message that must be translated. You have to actually read the code, and possibly rewrite the function.

Clearly, when some_function has any complexity to it at all, you're stumped as to why one part of your application is still in Swedish while the rest was successfully I18N'd into other languages. (Not to pick on Swedes in particular, replace this with any language used for development different from final deployment.)

Worse, of course, if you're working in C or C++, you might have some of this split between pre-processor macros and proper C-language syntax.

And in a dynamic language -- where code can be built on the fly -- you'll be paralyzed by a design in which you can't positively identify all the code. Dynamically generating code is a bad idea in general, and it also makes your retroactive I18N job impossible.

S.Lott
Your example provides another case-study. I18n is not as simple as replacing strings with another language, you must design your system to work nicely with i18n. In this case, concatenating strings makes i18n almost impossible until that code is replaced. Grammar is not the same in other languages!
Chris
Good point. But that doesn't mean you have to write a fully-internationalized app from the beginning. It just means you should keep internationalization in mind when designing the application. (Which, I think, is always good advice!)
nikie
"Impossible" is far too strong a word. "Impossible without refactoring", sure. It's fair to say internationalisation is more expensive to do later, but that doesn't mean to say it can't be deferred until after you've already got a customer who'll pay you for it.
Iain Galloway
"Impossible without refactoring" is the functional equivalent of "impossible". If you have to refactor or rewrite, you're not doing I18N, you're rewriting, which is a broader (and less well-defined) activity.
S.Lott
@nikie: If you "design with I18N in mind" then all your message strings are in properties (or configuration) files and your application gets those external strings. That *is* I18N, and there's no further work to do. You *are* fully internationalized. All that's left is to use `locale` for dates and numbers. Which you should always do anyway.
S.Lott
@S. Lott: for me "design with I18N in mind" means: write your code in a way that simply GREPing/replacing string literals at a later time is all you have to do for I18N. That means, no string concatenation code or similar. But you can still use string literals. If you do that, I18N later isn't any more difficult than I18N from the start.
nikie
@nikie: That's the point. Designing with I18N is cheaper than trying to fix a program that was *not* designed with I18N in mind. Simply avoiding concatenation is a start. For only a tiny bit more up-front cost, all string literals can be put into a properties file. Either way, starting with I18N is clearly cheaper than reworking.
S.Lott
By popular vote
flybywire
+4  A: 

I'm going to have to disagree that it costs more to add it to an existing application than from scratch with a new one.

  • A lot of the time i18n is not required until the application gets 'big'. When you do get big, you will likely have a bigger development team to devote to i18n, so it will be less of a burden.
  • You may not actually need it. A lot of small teams put great effort into supporting internationalization when they have no customers who require it.
  • Once you have internationalized, incremental changes become more time-consuming. Every time you need to add a string to the product, you need to add it to the bundle first and then add a reference. No, it is not a lot of work, but it is effort and does take a bit of time.

I prefer to 'cross that bridge when we come to it' and internationalize only when you have a paying customer looking for it.

Chris Dail
Less of a time burden, but if 4 developers are working on it to reduce the time, that's 4 developers that are being paid...
pete the pagan-gerbil
While that's true, Chris is also (reasonably) assuming you also have a client paying you for those 4 developers. Certainly it's probably more expensive to do it later (particularly because you may need to seriously refactor your application), but it's less risky. Would you rather spend two person-months on internationalisation once you've already got a revenue stream, or would you rather delay your revenue stream for one person-month to get it in from the start?
Iain Galloway
Reasonable point but poorly worded IMHO (no offence). I believe it definitely costs more to add i18n to an existing app than to work from scratch, so your first paragraph is wrong. However you are correct that it may still be the **right business decision** to delay i18n: perhaps you need quick revenue to survive, or perhaps you need the early success to raise funds for the i18n effort (which I think is what you mean by "having a big development team" - headcount is one thing but you also need money in the bank to pay them if you are going to do i18n).
MarkJ
A: 

I can't say which is more expensive, but I can tell you that a clean API lets you internationalise your application at very low cost.

streetparade
A: 

If you truly think you get "unicode handling" "for free", you may have a surprise coming your way when you try.

Unless you use a framework that has proven i18n ability beyond languages with the ANSI or very similar character sets, you will find several niggles and more major issues where the unicode handling isn't quite right, or simply unavailable. Even with relatively common languages (e.g. German) you can run into difficulty with shrinking or expanding letter counts and APIs that don't support unicode.
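As a small illustration of the "shrinking or expanding letter counts" problem, here is a sketch in Python 3 (the sample word is just an example). Uppercasing the German "ß" produces two letters, so code that assumes case conversion preserves length, or that compares with `lower()`, gets it wrong:

```python
word = "Straße"          # German for "street"
upper = word.upper()     # Python applies the full Unicode case mapping

# The lengths differ because "ß" uppercases to "SS".
print(upper, len(word), len(upper))

# lower() keeps the "ß", so this caseless comparison fails...
print("STRASSE".lower() == word.lower())
# ...while casefold() is designed for caseless matching and succeeds.
print("STRASSE".casefold() == word.casefold())
```

A fixed-size buffer or a fixed-width UI label sized for the original string would be too small for the converted one.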

And then think of languages with different reading-ordering!

This is one of the reasons you should really plan it in from the beginning, and test the stuff to destruction on the set of languages you plan to support.

MadKeithV
The biggest surprise may come when you realize different cultures don't agree on how to sort strings, or even on whether two strings are equal. Some cultures treat two different strings of Unicode characters as identical, even when they have a different number of characters.
MarkJ
A: 

The concept of i18n and l10n is broader than merely translating strings to and from some languages.

Example: Consider the input of date and time by users. If you don't have internationalization in mind when you design

a) the interface for the user and

b) the storage, retrieval and display mechanism

you will have a really bad time when you want to enable other input schemes.

Agreed, in most cases i18n is not necessary in the first place. But, and that is my point, if you don't spend a thought on the areas that must be touched for i18n, you will end up rewriting large portions of the original code. And then adding i18n is a lot more expensive than having spent some thought beforehand.
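To make the date example concrete, here is a minimal Python sketch, assuming the input and display pattern is looked up per locale while storage always uses ISO 8601. The `DATE_PATTERNS` table and the three helper functions are hypothetical; a real application would take its patterns from proper locale data rather than a hand-written dict:

```python
from datetime import date, datetime

# Hypothetical per-locale patterns, hard-coded here only for illustration.
DATE_PATTERNS = {
    "en_US": "%m/%d/%Y",
    "de_DE": "%d.%m.%Y",
    "ja_JP": "%Y/%m/%d",
}


def parse_user_date(text, locale_code):
    """Parse what the user typed, using the pattern for their locale."""
    return datetime.strptime(text, DATE_PATTERNS[locale_code]).date()


def to_storage(d):
    """Store dates in one locale-neutral format (ISO 8601)."""
    return d.isoformat()


def display_date(stored, locale_code):
    """Format a stored ISO date for display in the viewer's locale."""
    return date.fromisoformat(stored).strftime(DATE_PATTERNS[locale_code])


stored = to_storage(parse_user_date("03/04/2021", "en_US"))
print(stored)                         # 2021-03-04
print(display_date(stored, "de_DE"))  # 04.03.2021
```

If the storage layer had kept the raw "03/04/2021" string instead, there would be no way to tell later whether it meant March 4 or April 3.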

Boldewyn
A: 

One thing that can be a big issue is the different character counts for a message in various languages. I do some work on iPhone apps, and this matters especially on a small screen: if you design the UI for a message that has 10 characters, then internationalize later and find you need 20 characters to display the same thing, you have to redo your UI to accommodate it. Even with desktop apps this can still be a large PITA.

jamone
A: 

Yes, internationalizing an existing app is definitely more expensive than developing the app as internationalized from day one. And it's almost never trivial.

For instance

Message = "Do you want to load the " & fileType() & " file?"

cannot be internationalised without some code alterations, because many languages have grammatical rules like gender agreement. You often need a different message string for every possible file type, unlike in English, where it's possible to bolt together substrings.
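A sketch of what "a different message string for every file type" can look like, in Python. The `LOAD_PROMPTS` catalog and `load_prompt` function are hypothetical names, and the Spanish sentences are only illustrative translations, chosen because the article changes with the noun's gender:

```python
# Hypothetical catalog: one complete sentence per (language, file type),
# instead of one sentence fragment with the file type spliced in.
LOAD_PROMPTS = {
    "en": {
        "image": "Do you want to load the image file?",
        "document": "Do you want to load the document file?",
    },
    "es": {
        "image": "¿Desea cargar la imagen?",        # feminine article "la"
        "document": "¿Desea cargar el documento?",  # masculine article "el"
    },
}


def load_prompt(lang, file_type):
    """Return the full, pre-translated prompt for this file type."""
    return LOAD_PROMPTS[lang][file_type]


print(load_prompt("es", "image"))
print(load_prompt("es", "document"))
```

The cost is more strings to translate, but each one is a whole sentence a translator can get right.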

There are many other issues like this: you need more UI space because some languages need more characters than English to express the same concept, you need bigger fonts for East Asia, you need to use localised dates and times in the user interface but perhaps English US when communicating with databases, you may need to use a semicolon as the delimiter for CSV files in some locales, string comparisons and sorting are cultural, phone numbers & addresses...

So do you think a project starting today, must be internationalized, or can that decision be deferred to the future based on the success (or not) the software enjoys and the geographic distribution of the demand?

It depends. How likely is the specific project to be internationalised? How important is it to get a first version fast?
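The CSV point is easiest to see with numbers: in locales that write decimals with a comma, a comma-delimited file becomes ambiguous, which is why a semicolon is commonly used there. A minimal Python sketch, where `write_csv` is a hypothetical helper and the delimiter and decimal separator are passed in by hand (a real application would derive them from the user's locale settings):

```python
import csv
import io


def write_csv(rows, delimiter, decimal_sep):
    """Write (name, amount) rows using the given field and decimal separators."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    for name, amount in rows:
        # Format with a dot first, then swap in the locale's decimal separator.
        writer.writerow([name, f"{amount:.2f}".replace(".", decimal_sep)])
    return buf.getvalue()


rows = [("Widget", 1234.5), ("Gadget", 0.75)]
print(write_csv(rows, ",", "."))  # comma-delimited, decimal point
print(write_csv(rows, ";", ","))  # semicolon-delimited, decimal comma
```

Code that hard-codes `","` as the delimiter in many places is another thing that has to be tracked down when internationalising after the fact.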

MarkJ
A: 

It depends on your project and how your team is organised.

I've been involved in the internationalization of a website, and it was one developer full-time for a year, probably about 6-8 months part-time for me to handle installation impacts when needed (reorganising files, etc), and other developers getting involved from time to time when their projects needed heavy refactoring. This was in an application that was at v3.

So that's definitely expensive. What you have to ask is how expensive is it to provide a localization system from the start, and how will that impact the project in the early stages. Your project at v1 may not be able to survive delays and setbacks caused by issues with a hastily-designed internationalization framework, while a stable v3 project with a wide customer base may have the capital to invest in doing that properly.

It also depends on whether you want to internationalize everything including log messages, or just the UI strings, and how many of those UI strings there are, and who you have available to do localization and the QA that goes with it, and even what languages you want to support. For example, does your system need to support Unicode strings (which is a requirement for Asian languages)?

JohnL
A: 

And don't forget that changing the database backend to support internationalized data can be costly as well. Just try to change that varchar field to nvarchar when you already have 20,000,000 records.

HLGEM
A: 

I think it depends on the language. Every J2EE (Java web) app is i18n'd, because it's very easy (even the IDE can extract strings for you, and you just name them).

In J2EE it's cheaper to add it later; however, the culture is to add it as soon as possible. I think that's because J2EE uses a lot of open source, and almost all open-source libraries are i18n'd. It's a great idea for them, but not for most J2EE apps: most enterprise apps are built for one company that speaks one language.

Plus, if you have bad testers, adding it too soon makes them file bug reports about labels and translations (I have only once seen translations NOT done by developers). After the testers are done with it, you have a buggy app with excellent i18n support. It might be fun for users to switch languages and see if they can still use the app, but using your app is just boring work for them, so they won't even do that. The only users of i18n are the testers.

Weird string joining is not part of J2EE culture, since you know that one day someone might want to make the app i18n. The only problem is extracting labels from HTML templates.

01