I'm getting the error message: "Invalid byte sequence for encoding "UTF8": 0x9f"

OK, so now I know that somewhere my PHP app is sending a query that contains that 0x9f byte. But I have no idea WHERE.

I checked postgresql.conf but I didn't find anything like "log_on_error". There's only the log_statement parameter which causes postgres to log all selects or just all queries. But what I would like to see is this:

ERROR: Invalid byte sequence for encoding "UTF8": 0x9f
QUERY: SELECT * FROM blabla WHERE field1='blabla0x9f'

In this case I would be able to see which query caused the error, so I would know which PHP script to check.

Is this possible with postgres? My psql version is 8.3.9.

+1  A: 

You are looking for log_min_error_statement for that.

But "invalid byte sequence" comes in the parser before the text is even parsed into a statement. So there is no way to log that without risking logging it in a weird encoding and making it either useless or dangerous.

But presumably your PHP application detects the error? If not, you are not checking enough return codes there ;)

Magnus Hagander
OK, but how can this happen? All my pages are in UTF-8 (using charset=utf-8 in the HTML head). My DB is in UTF-8. So how can this happen? Do you know what character that 0x9f thing is?
NovumCoder
The pages are very clearly not being *processed* in UTF-8. Probably your PHP environment is not set for UTF-8. I'm no PHP expert, so I don't know exactly how to set that. Perhaps default_charset is what you're looking to set in your PHP config?
Magnus Hagander
A: 

Just to be clear, because I don't think Magnus is getting through (though his answer is pretty good): if you read the file as UTF-8 but it is encoded in latin1, it doesn't magically get converted to UTF-8. You can only work with the file in the encoding of the file, whether submitting it to the DB or re-encoding it in a different encoding. You have three options (that I might use):

Of course, the most correct way is the first one.

  1. Convert the file using PHP. Read the docs on recode for more information about this.

  2. Change the client_encoding on PostgreSQL, using SET client_encoding = encoding. You can find the valid encodings in the docs.

  3. Send it to PostgreSQL to be converted; read the docs on PostgreSQL's convert().
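
To make the first option concrete, here is a minimal sketch in Python rather than PHP, and not taken from the answers above. It shows why a lone 0x9f byte is rejected as UTF-8 and what re-encoding amounts to. The helper names are hypothetical, and the source encoding is an assumption: the code guesses Windows-1252 (where 0x9f is "Ÿ"), but your data may really be in something else.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def reencode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode with the encoding the data is really in, then encode as UTF-8.

    The cp1252 default is only an assumption for this example.
    """
    return data.decode(source_encoding).encode("utf-8")


# The value from the question, ending in a lone 0x9f byte.
raw = b"blabla\x9f"

# 0x9f can only be a continuation byte in UTF-8, so on its own it is invalid.
offset = first_invalid_utf8_offset(raw)

# After re-encoding, the single byte becomes a valid two-byte UTF-8 sequence.
fixed = reencode_to_utf8(raw)
```

Running the same kind of check on the values your scripts pass to the database (before the query is sent) is one way to find which script produces the bad byte, since the server cannot safely log it for you.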

Information about the PHP function recode

Evan Carroll