Hi all,

I am on Mac OS X 10.5 (I also reproduced the issue on 10.4).

I am trying to use iconv to convert a UTF-8 file to ASCII. The file contains characters like 'éàç', and I want the accented characters turned into their closest ASCII equivalents.

My command is:

iconv -f UTF-8 -t ASCII//TRANSLIT//IGNORE myutf8file.txt

This works fine on a Linux machine, but on my local Mac OS X I get this, for instance:

è => 'e

à => `a

I really don't understand why iconv returns this odd output on Mac OS X when all is fine on Linux.

Any help or pointers?

Thanks in advance.

A: 

My guess is that the locale is set differently on your Linux machine. As far as I can remember, iconv uses the current locale when transliterating, and on Mac OS X the locale may default to "C", which does not handle accents or language-specific characters. Maybe try setting a UTF-8 locale in the shell before running iconv:

export LC_ALL=en_US.UTF-8

|K<

kent
Hi, thank you, but this does not seem to be the problem: I have changed the locale and it has not changed anything. Or maybe I don't know how to actually change the locale. I am doing this in my .bash_profile:

export LC_ALL=fr_FR.UTF-8

and running locale returns:

LANG=
LC_COLLATE="fr_FR.UTF-8"
LC_CTYPE="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_ALL="fr_FR.UTF-8"

Hope this helps in finding an answer.
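For what it's worth, that locale output suggests the variable is being picked up: LC_ALL takes precedence over the individual LC_* variables, which in turn take precedence over LANG. Here is a rough Python sketch of that lookup order, in case someone wants to check which setting wins in their own environment. The function name effective_lc_ctype is made up for this illustration, and the fallback to "C" is the usual default rather than something every system guarantees.

```python
import os


def effective_lc_ctype(env=None):
    """Return the locale name that LC_CTYPE resolves to from the environment."""
    env = os.environ if env is None else env
    # Lookup order: LC_ALL overrides LC_CTYPE, which overrides LANG.
    # Variables that are set but empty (like LANG= above) count as unset.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    # Assumption: with nothing set, programs typically start in the "C" locale.
    return "C"


# Shows what the current shell environment resolves to.
print(effective_lc_ctype())
```

With the settings quoted above this resolves to fr_FR.UTF-8, so the locale is being changed; it just does not affect the result.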
A: 

does anyone have a fix for this?

+1  A: 

The problem is that Mac OS X uses a different implementation of iconv, called libiconv, whereas most Linux distributions use the iconv implementation that is part of libc (glibc). Unfortunately, libiconv transliterates characters such as ö, è and ñ as "o, `e and ~n. The only fix I found is to download the libiconv source and modify the translit.h file in the lib directory. Find lines that look like this:

2, '"', 'o',

and replace them with something like this:

1, 'o',

I spent hours on Google trying to figure out the answer to this problem and finally decided to download the source and hack around with it. Hope this helps someone!
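If rebuilding libiconv is not an option, one alternative is to skip iconv for this step and strip the accents with Unicode decomposition instead. This is a different technique from iconv's //TRANSLIT, not a fix for it. A minimal Python sketch, assuming the input is valid UTF-8 text; to_ascii is just a name chosen for this example:

```python
import unicodedata


def to_ascii(text):
    """Approximate accented characters by their unaccented base letters."""
    # NFKD splits e.g. "é" into "e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" drops the combining marks, and also any
    # other non-ASCII character that has no ASCII decomposition.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("éàç"))  # prints: eac
```

Note that characters with no decomposition, such as ø, ß or æ, are silently dropped by this approach, so it only covers the plain accented-letter case from the question.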