My web app breaks when I try to edit a certain content type, and I'm pretty sure it is because of some weird characters in my database. When I run:

SELECT body FROM message WHERE id = 666

it returns:

<p>â¢ <span></span></p><p><br /></p><p><em><strong>NOTE:</strong> Please remember to use your to participate in the discussion.</em></p>

However, when I try to count how many rows contain those characters, Postgres complains:

foo_450_prod=# SELECT COUNT(*) FROM message WHERE body LIKE'%â¢%';

ERROR:  invalid byte sequence for encoding "UTF8": 0xe2a225
HINT:  This error can also happen if the byte sequence does not match the encodi

Does anybody know what the issue is and how I can query for those funny characters?

Thanks in advance!

A: 

There is a long way between your DB and the data printed in your web page. Your DB encoding may be fine, but you are probably displaying text that was originally UTF-8 as ISO-8859-1, so these are not really "funny" characters. Do you have something like:

<meta content="text/html; charset=UTF-8" http-equiv="content-type" />

in the <head> tag of your HTML page?

Also, are you running SET NAMES 'utf8' when connecting to your DB?
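
To illustrate the kind of mix-up described above, here is a minimal Python sketch. It assumes the stored character was originally a bullet (U+2022); that is only a guess, since the thread never confirms what the character was. It shows how UTF-8 bytes read as ISO-8859-1 turn into "â¢", and how reversing the wrong decode gets the original text back.

```python
# Assumption (not confirmed in the thread): the original character was a bullet, U+2022.
original = "\u2022"

# UTF-8 encodes the bullet as three bytes: e2 80 a2.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as ISO-8859-1 yields 'â', an invisible control char (U+0080), and '¢'.
misread = utf8_bytes.decode("latin-1")

# Dropping non-printable characters mimics what a browser or terminal typically displays.
visible = "".join(ch for ch in misread if ch.isprintable())

# Undoing the wrong decode recovers the original text.
repaired = misread.encode("latin-1").decode("utf-8")

print(utf8_bytes.hex(), repr(misread), visible, repaired)
```

If the page shows "â¢" but the repaired value looks right, the data in the DB is probably fine and only the display or connection encoding is off.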

darma
Hmm, maybe, but the stack traces tell me it is an SQL error...
hdx
But you only get the error when you run a SELECT with the wrong encoding (LIKE'%â¢%'), right?
darma
If the DB is utf-8, then it sure sounds like the web page is not in UTF-8.
NinjaCat
@darma Yes, but the query I posted is one I'm using to investigate the issue on the database side, not the one my web app is using. Sorry for the confusion. What really puzzles me is that both of those characters are legal UTF-8 characters, yet pg won't let me query for them...
hdx
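
On the last point, one possible explanation can be sketched in Python. It assumes the psql session sent the pattern as single-byte ISO-8859-1 while the server expected UTF-8; the thread does not confirm the terminal or client_encoding settings, and `is_valid_utf8` is a helper made up for this illustration. Under that assumption the bytes sent match the 0xe2a225 in the error message.

```python
# The LIKE pattern from the question: %â¢%
pattern = "%\u00e2\u00a2%"

# Assumption: a Latin-1 terminal sends one byte per character.
as_latin1 = pattern.encode("latin-1")

# A UTF-8 terminal would send two bytes for each of 'â' and '¢'.
as_utf8 = pattern.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(as_latin1.hex(), is_valid_utf8(as_latin1))
print(as_utf8.hex(), is_valid_utf8(as_utf8))
```

In the Latin-1 case, byte e2 announces a three-byte UTF-8 sequence, a2 is an acceptable second byte, and the trailing 25 (the % sign) is not a valid third byte. So the characters themselves are legal, but the bytes that reached the server are not valid UTF-8.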