Ruby works well with Unicode characters in file paths and filenames on Mac OS X and Linux, so why did it take more than two years to make it work on Windows?

I was just looking at Google Code Jam. People are solving non-trivial problems within a few hours. At work, I can imagine a filename or path issue involving Unicode characters being solvable within a day or two, a few days, or one or two weeks, even if the fix has to go into the standard library. But two years?

What might be the reason? I think Mac OS X and Linux might have worked from the start because they use UTF-8, and a lot of ASCII-oriented program code works with UTF-8 without any modification.

Windows might be returning filenames and paths in UTF-16, which is more complicated, but there are functions to convert UTF-16 to UTF-8 and vice versa, so isn't it a fairly solvable problem?
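To make the conversion concrete, here is a minimal sketch using Ruby's built-in `String#encode`. The filename is made up for illustration, and the sketch assumes the Windows side uses little-endian UTF-16 (`UTF-16LE`); it only shows the round trip and does not call any Windows API.

```ruby
# A made-up filename containing non-ASCII characters ("résumé.txt"),
# held as an ordinary UTF-8 Ruby string.
utf8_name = "r\u00E9sum\u00E9.txt"

# UTF-8 -> UTF-16LE (assumed here to be the layout of Windows wide strings).
utf16_name = utf8_name.encode(Encoding::UTF_16LE)

# UTF-16LE -> UTF-8 again.
round_trip = utf16_name.encode(Encoding::UTF_8)

puts utf8_name.bytesize       # 12: each "é" takes two bytes in UTF-8
puts utf16_name.bytesize      # 20: two bytes per character for this name
puts round_trip == utf8_name  # true: nothing is lost in the round trip
```

The conversion itself is a one-liner; the open question is how much surrounding code has to be touched to apply it everywhere a path crosses into the OS.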

+2  A: 

It sure is a solvable problem, but I think no one on the core team is using Windows for development. For such topics the OSX/Linux/BSD/... solution is available quickly, as in most cases it is just one solution for all these platforms, and those are the platforms mainly used by the core developers and by people close to the core (i.e. willing to come up with a fix and offer support). Also, keep in mind that Ruby's main use case is web apps, and, at least in Ruby land, it's rather uncommon to use Windows for deployment.

As a helper language for desktop/console applications, Ruby is only popular on OSX, as I see it. On Linux, Python is rather dominant in this area, and on Windows there is no such thing (maybe VBScript, though), as you often don't have small applications interacting with each other (console programs, pipes, the KISS principle, the UNIX principle: none of these are very common on Windows, where you have to write a service for anything, and so on). But I cannot really judge that, as I haven't used Windows in years. Therefore you only have a handful of people who really run into this issue. And if none of them is willing to fix it, it takes two years.

Konstantin Haase
+2  A: 

Because you need a huge layer between the OS and your program to do every trivial operation.

Even fopen() does not work with UTF-8 on Windows. In other words, the reason is that the Windows Unicode API is... crap (sorry, all Windows developers).

So supporting Unicode on Windows is very hard, while all other OSes live happily with UTF-8.
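A rough sketch of what one small piece of such a layer might look like in Ruby. The `WidePath` module and its method names are hypothetical, not anything from Ruby's standard library, and the sketch assumes the wide-character functions (for example `_wfopen`) want NUL-terminated UTF-16LE. It only prepares and decodes the byte strings; actually calling the OS is left out.

```ruby
# Hypothetical helper; the module and method names are illustrative only.
module WidePath
  # UTF-8 Ruby string -> NUL-terminated UTF-16LE bytes, the shape a
  # wide-character function is assumed to expect for a path argument.
  def self.to_wide(path)
    (path.encode(Encoding::UTF_8) + "\0").encode(Encoding::UTF_16LE).b
  end

  # UTF-16LE bytes handed back by the OS -> UTF-8 Ruby string,
  # dropping one trailing NUL terminator if present.
  def self.from_wide(bytes)
    bytes.dup.force_encoding(Encoding::UTF_16LE)
         .encode(Encoding::UTF_8)
         .chomp("\0")
  end
end

wide = WidePath.to_wide("a\u00E9")   # "aé"
p wide.bytes                         # [97, 0, 233, 0, 0, 0]
p WidePath.from_wide(wide)           # "aé"
```

Every file operation that takes or returns a path would need to pass through a pair of conversions like this, which is where the size of the layer comes from.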

Artyom
The frustrating thing is that UTF-8 *is* supported in the `MultiByteToWideChar` and `WideCharToMultiByte` functions, but you can't set it as the default encoding. It's like Microsoft went out of their way to make UTF-8 not work >:-(
dan04