I have set the HTML meta tag to UTF-8, and also the form charset to UTF-8. To be uber sure, I have run the PHP function utf8_encode() on the offending string.

Even after all this, I am getting a Postgres error when executing the query via the PHP function pg_query().

Yet when I copy and paste the query into a command line, it executes fine.

The query (or the offending part at least): INSERT INTO locations (... alt_thorfare_name ... ) VALUES (... 'BÓTHAR GHLEANN/SHEÁIN' ...);

The error: Query failed: ERROR: invalid byte sequence for encoding UTF8: 0xd354 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by client_encoding

The problem is the fada (the acute accent used in Irish) on the O and A.

I am totally stuck on what to do next to even attempt finding a solution. Any help appreciated.

+1  A: 

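To be uber sure, I have run the PHP function utf8_encode() on the offending string

This is where things go wrong: you don't have to encode UTF-8 to UTF-8, it's already UTF-8. utf8_encode() converts from ISO-8859-1 to UTF-8, so running it on a string that is already UTF-8 encodes it a second time.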
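As a rough illustration of the double encoding described here, the sketch below takes the UTF-8 bytes for "Ó" and converts them again as if they were ISO-8859-1. It uses mb_convert_encoding() with an ISO-8859-1 source, which performs the same conversion as utf8_encode(); the variable names are made up for this example.

```php
<?php
// "Ó" as it arrives from a UTF-8 form or a UTF-8 source file: bytes C3 93.
$alreadyUtf8 = "\xC3\x93";

// Same conversion utf8_encode() performs: treat every input byte as an
// ISO-8859-1 character and re-encode it as UTF-8.
$doubleEncoded = mb_convert_encoding($alreadyUtf8, 'UTF-8', 'ISO-8859-1');

echo "original:       ", bin2hex($alreadyUtf8), "\n";   // c393
echo "double-encoded: ", bin2hex($doubleEncoded), "\n"; // c383c293
```

The second value is still valid UTF-8, but it no longer spells "Ó"; it decodes to "Ã" followed by a control character.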

Frank Heikens
+1  A: 

That error occurs because you're not actually passing UTF-8 data to pg_query.

0xd354 is not a valid UTF-8 sequence; it is the ISO-8859-1 byte sequence for "ÓT". See:

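header("Content-type: text/plain");
// "ÓT" written as an HTML entity, so the result does not depend on this file's encoding
$s = "&Oacute;T";
// Decode the entity into UTF-8 and dump the raw bytes as hex
$utf8 = mb_convert_encoding($s, "UTF-8", "HTML-ENTITIES");
$u = unpack("H*", $utf8);
echo "UTF-8 byte sequence: " . reset($u), "\n";
// Decode the same entity into ISO-8859-1 and dump the raw bytes as hex
$iso8859 = mb_convert_encoding($s, "ISO-8859-1", "HTML-ENTITIES");
$u = unpack("H*", $iso8859);
echo "ISO-8859-1 byte sequence: " . reset($u), "\n";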

gives

UTF-8 byte sequence: c39354
ISO-8859-1 byte sequence: d354

Do this:

  • Don't use utf8_encode() unless you actually receive ISO-8859-1 data from somewhere (an external plain text file, for example).
  • Encode your PHP files as UTF-8 without BOM.
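
If it is unclear which encoding a string arrives in, one possible approach is to check it before sending it to Postgres. This is only a sketch: to_utf8() is a hypothetical helper, not something from this answer, and it assumes that anything that is not valid UTF-8 is ISO-8859-1, which may not hold for your data.

```php
<?php
// Hypothetical helper: leave valid UTF-8 untouched, otherwise assume the
// bytes are ISO-8859-1 and convert them once.
function to_utf8(string $s): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already UTF-8, converting again would double-encode it
    }
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

// "BÓTHAR" with Ó as the single ISO-8859-1 byte D3 (the case from the error).
$fromLatin1 = to_utf8("B\xD3THAR");
// "BÓTHAR" with Ó as the UTF-8 bytes C3 93.
$fromUtf8 = to_utf8("B\xC3\x93THAR");

echo bin2hex($fromLatin1), "\n"; // 42c39354484152
echo bin2hex($fromUtf8), "\n";   // 42c39354484152
```

Both inputs end up as the same UTF-8 bytes, which is what the server expects when client_encoding is UTF8. If the data really has to stay in ISO-8859-1, calling pg_set_client_encoding() on the connection to declare that encoding is another option.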
Artefacto