I am importing data from a CSV file. One of the fields contains an accented character (Telefónica O2 UK Limited). The application throws an error while inserting the data into the table.

PGError: ERROR:  invalid byte sequence for encoding "UTF8": 0xf36e6963
HINT:  This error can also happen if the byte sequence does not match the 
encoding expected by the server, which is controlled by "client_encoding".
: INSERT INTO "companies" ("name", "validated") 
    VALUES(E'Telef?nica O2 UK Limited', 't')

Data entry through the forms works when I enter names with accents and umlauts. How do I work around this issue?

Edit

I addressed the issue by converting the file encoding: I uploaded the CSV file to Google Docs and exported it as CSV again.

+5  A: 

The error message is pretty clear: your client_encoding setting is UTF8, and you are trying to insert a character that isn't encoded in UTF-8 (if it's a CSV from MS Excel, your file is probably encoded in Windows-1252 instead).
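To see why the server rejects the value, the bytes quoted in the error message (0xf36e6963) can be checked directly. The following is a small illustrative sketch in Python; the original post does not say which language the importer is written in, and the helper name `is_valid_utf8` is made up for this example.

```python
# Bytes reported by PostgreSQL in the error message: 0xf36e6963
raw = bytes.fromhex("f36e6963")

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# 0xf3 starts a multi-byte UTF-8 sequence, but 0x6e ("n") is not a
# continuation byte, so the sequence is invalid as UTF-8.
utf8_ok = is_valid_utf8(raw)

# Read as Windows-1252, the same four bytes are ordinary characters.
as_cp1252 = raw.decode("cp1252")

print(utf8_ok, as_cp1252)  # False ónic
```

The four bytes are the "ónic" in "Telefónica" as a Windows-1252 file would store them, which is consistent with the guess above about the file's encoding.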

You could either convert it in your application or you can alter your PostgreSQL connection to match the encoding you want to insert (thus enabling PostgreSQL to do the conversion for you). You can do so by executing SET CLIENT_ENCODING TO 'WIN1252'; on your PostgreSQL connection before trying to insert that data. After the import you should reset it to its original value with RESET CLIENT_ENCODING;
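The first option, converting in the application, could look roughly like the sketch below. It assumes the file really is Windows-1252 (only a guess in this thread) and uses a hypothetical helper `read_csv_rows`; it decodes the raw bytes before parsing, so every value handed to the database layer is a proper Unicode string that can be sent as UTF-8.

```python
import csv
import io

def read_csv_rows(data: bytes, source_encoding: str = "cp1252"):
    """Decode raw CSV bytes from source_encoding and return rows as lists of str.

    source_encoding is an assumption; use whatever the file was saved in.
    """
    text = data.decode(source_encoding)
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))

# Simulated file content as a Windows-1252 export would produce it.
sample = "name,validated\r\nTelefónica O2 UK Limited,t\r\n".encode("cp1252")

rows = read_csv_rows(sample)
name = rows[1][0]

# Re-encoding as UTF-8 yields bytes the server will accept.
utf8_name = name.encode("utf-8")
print(name)
```

If the source encoding is guessed wrong, decoding may still succeed but yield the wrong characters, so it is worth checking a few accented values by eye before importing.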

HTH!

Henning
+1 - You can also try to convert the file by hand, e.g. with iconv. But give yourself some time to understand what you are doing; trial and error does not work well here. A programmer in 2010 must understand the basics of Unicode and character encodings.
leonbloy
It was an issue with the file encoding. I uploaded the CSV file to Google Docs and exported the file again. That fixed the encoding issue.
KandadaBoggu