We wanted to convert a Unicode string in Slovak into plain ASCII, without accents or carons. That is: č->c, š->s, á->a, é->e, etc.

We tried:

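require 'iconv'  # in the standard library up to Ruby 1.9, a separate gem since 2.0
cstr = Iconv.conv('us-ascii//translit', 'utf-8', a_unicode_string)  # arguments: to, from, string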

This worked on one system (Mac) but not on the other (Ubuntu), where every accented character came out as '?' after conversion.

Problem: iconv uses the LANG/LC_ALL variables. I do not know why it needs them when both encodings are given explicitly, but you have to set the locale variables to something.utf8, for example sk_SK.utf8 or en_GB.utf8.

The next step was to set ENV['LANG'] and ENV['LC_ALL'] in config/application.rb. Iconv in Ruby ignored this, presumably because the process locale is already fixed by the time the application code runs.

Another try was the system-wide setting in /etc/default/locale. This worked on the command line, but not for the Rails application, because Apache runs with its own environment. The final solution was therefore to add the LANG/LC_ALL variables to /etc/apache2/envvars:

export LC_ALL="en_GB.utf8"
export LANG="en_GB.utf8"
export LANGUAGE="en_GB.utf8"

After restarting Apache, it worked.
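
To see which locale variables a Ruby process actually gets (for example, comparing a shell session with the Rails application running under Apache), a small check like the one below may help. It is only a sketch: the helper name locale_env and the list of variables are my own choices, not part of Iconv or Rails.

```ruby
# Hypothetical helper: collect the locale-related variables visible to this process.
LOCALE_VARS = %w[LANG LC_ALL LC_CTYPE LANGUAGE].freeze

def locale_env(env = ENV)
  LOCALE_VARS.each_with_object({}) do |name, seen|
    seen[name] = env[name] # nil when the variable is unset
  end
end

# Print one line per variable, e.g. from a console or a temporary controller action.
locale_env.each { |name, value| puts "#{name}=#{value.inspect}" }
```

If the values printed under Apache differ from those in the shell, the web server environment is the place to fix.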

This is more of a little how-to than a question. However, if someone has a better solution, I would like to know about it.
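
One possible alternative that avoids Iconv, and with it the dependence on LANG/LC_ALL, is to decompose the string with Unicode normalization and drop the combining marks. This is a minimal sketch, assuming Ruby 2.2 or later (where String#unicode_normalize is available); the method name strip_accents is made up for illustration.

```ruby
# Sketch: strip diacritics without Iconv, so the result does not depend on the locale.
# Assumes Ruby 2.2+; strip_accents is a hypothetical name.
def strip_accents(text)
  text.unicode_normalize(:nfd)  # decompose, e.g. "č" becomes "c" + combining caron
      .gsub(/\p{Mn}/, '')       # drop the combining (nonspacing) marks
end

puts strip_accents('čšáé')  # prints "csae"
```

Unlike //translit, this only handles characters that decompose into a base letter plus combining marks, which covers the Slovak accented letters. Characters without such a decomposition (for example ß or ø) pass through unchanged, so the result is not guaranteed to be pure ASCII for arbitrary input.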