I'm using fopen to read in a CSV file and fgetcsv to read its lines. The CSV is encoded as Windows-1252. How do I convert it to UTF-8 so that lines containing non-standard characters aren't cut off?

So far I've tried the following:

setlocale(LC_ALL, 'en_GB.UTF-8');

and

drupal_convert_to_utf8($csv_line[3], 'Windows-1251'); // (I'm using Drupal 6.16)

Both seem to fail.

Any help or general pointers in the right direction would be great.

+1  A: 

Hi there, you can use iconv for this kind of work.
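As a rough sketch of how that could look for rows returned by fgetcsv, the snippet below converts each field of a row with iconv. The helper name convert_row and the temporary demo file are made up for illustration; the sample bytes are written as \x escapes so the result does not depend on how the PHP file itself is saved.

```php
<?php
// convert_row() is a hypothetical helper name, not part of PHP or Drupal.
// It converts every field of one CSV row to UTF-8 (assumes Windows-1252 input).
function convert_row(array $row, $from = 'Windows-1252')
{
    $out = array();
    foreach ($row as $field) {
        // iconv() returns false on failure; keep the original bytes in that case.
        $converted = iconv($from, 'UTF-8', (string) $field);
        $out[] = ($converted === false) ? $field : $converted;
    }
    return $out;
}

// Demo input: a small Windows-1252 file (0xF8 is "ø", 0xE9 is "é").
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "1,Felleskj\xF8pet\n2,Caf\xE9\n");

$rows = array();
$handle = fopen($path, 'r');
// Delimiter, enclosure and escape are passed explicitly rather than relying on defaults.
while (($line = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $rows[] = convert_row($line);
}
fclose($handle);
unlink($path);
```

Each entry of $rows then holds one row whose fields are UTF-8 strings.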

Guillaume Lebourgeois
I've looked into iconv using the following:

$str = 'Felleskjøpet';
echo iconv('Windows-1252', 'UTF-8', $str);

This outputs: Felleskj¿pet
digital
Are you sure your input is in `Windows-1252`? What makes you believe that?
Guillaume Lebourgeois
When I saved the test PHP file, its encoding was wrong; setting it to Windows-1252 works. My next issue is getting it to work with an array of data.
digital
+1  A: 

I do not know the drupal_convert_to_utf8 function, but have a look at mb_convert_encoding.

Try using mb_list_encodings to be sure your implementation supports Windows-1252. If not, try using ISO 8859-1; it is basically the same, differing only in the 0x80 to 0x9F range ( http://en.wikipedia.org/wiki/Windows-1252 ).
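A minimal sketch of that check and fallback, assuming the mbstring extension is loaded. The sample row is invented for illustration and uses a \x escape for the Windows-1252 byte:

```php
<?php
// Use Windows-1252 if this mbstring build lists it, otherwise fall back to ISO-8859-1.
$source = in_array('Windows-1252', mb_list_encodings(), true)
    ? 'Windows-1252'
    : 'ISO-8859-1';

// Hypothetical row as fgetcsv might return it (0xF8 is "ø" in both encodings).
$row = array('42', "Felleskj\xF8pet");

$utf8 = array();
foreach ($row as $field) {
    // Argument order: string, target encoding, source encoding.
    $utf8[] = mb_convert_encoding($field, 'UTF-8', $source);
}
```

Note the argument order: unlike iconv, mb_convert_encoding takes the target encoding before the source encoding.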

You should also make sure your CSV is actually in Windows-1252. Try using mb_detect_encoding for this and make use of the strict flag.
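One way this check might look, again assuming mbstring is available. The helper name guess_encoding and the candidate list are illustrative choices. Detection can tell valid UTF-8 apart from a single-byte encoding, but it cannot reliably tell one single-byte encoding from another, so treat the result as a hint rather than proof.

```php
<?php
// guess_encoding() is a hypothetical helper name used only for this sketch.
function guess_encoding($bytes)
{
    // Third argument true = strict mode: a candidate is rejected if the
    // string contains byte sequences that are invalid in that encoding.
    return mb_detect_encoding($bytes, array('UTF-8', 'Windows-1252'), true);
}

// A lone 0xF8 byte is not valid UTF-8, so UTF-8 is ruled out here.
$legacy = guess_encoding("Felleskj\xF8pet");

// The same word encoded as UTF-8 (0xC3 0xB8 is "ø").
$modern = guess_encoding("Felleskj\xC3\xB8pet");
```

If the first call reports UTF-8 for your real data, the file is probably not Windows-1252 after all and should not be converted.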

Daniel Baulig
