I'm using Kohana 3, which has full support for Unicode.

I have this as the first child of my <head>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

The Unicode character I am inserting is é, as in Café.

However, I am getting the triangle with a ? (as in could not decode character).

As far as I can tell in my own code, I am not doing any string manipulation on the text.

In fact, I have placed the accent straight into a view's PHP file and it is still not working.

I copied the character from this page: http://www.fileformat.info/info/unicode/char/00e9/index.htm

I've only just started examining PHP's Unicode limitations, so I could be doing something horribly wrong.

So, how do I display this character? Do I need to resort to the HTML entity?

Update

So this works

Caf<?php echo html_entity_decode('&#233;', ENT_NOQUOTES, 'UTF-8'); ?>

Why does that work? If I copy the output accented e from that script and insert it into my document, it doesn't work.

A: 

I guess you are seeing �, the replacement character for invalid UTF-8 byte sequences. That means your text is not UTF-8 encoded. Check your editor’s settings to see which encoding the PHP file is saved in.
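One way to test this directly is to check whether the file's bytes form valid UTF-8. The following is a minimal sketch, assuming PHP 7 or later; the helper name `is_valid_utf8` is hypothetical and not part of Kohana or PHP. It relies on the PCRE `u` modifier, which makes `preg_match()` return `false` when the subject is not valid UTF-8.

```php
<?php
// Hypothetical helper: true if $bytes is a valid UTF-8 byte sequence.
// With the /u modifier, PCRE validates the subject as UTF-8 first;
// preg_match() returns false (not 0) when that validation fails.
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// "\xC3\xA9" is é in UTF-8. A lone "\xE9" is é in ISO-8859-1,
// which is not a valid UTF-8 sequence.
var_dump(is_valid_utf8("Caf\xC3\xA9")); // bool(true)
var_dump(is_valid_utf8("Caf\xE9"));     // bool(false)
```

To check a real file, pass its contents to the helper, for example `is_valid_utf8(file_get_contents($path))`, where `$path` points at the view in question. The characters are written as byte escapes here so that the result does not depend on how this snippet itself is saved.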

If you’re not sure about the encoding of your sources, you can enforce UTF-8 compatibility as described here (German text): Force UTF-8.

You should never need entities except the basic ones (&amp;lt;, &amp;gt;, &amp;amp; and &amp;quot;).

toscho
I'm using Coda, and I just switched *Default File Encoding* to Unicode (UTF-8) and saved but it hasn't fixed it.
alex
Try deleting the character, switching the editor to UTF-8, and then pasting the character again. When the file is UTF-8 encoded, it should be 2 bytes larger with that character present.
chris
@chris Still no good.
alex
+1  A: 

View the HTTP headers. You should see something like

Content-Type: text/html; charset=UTF-8

Browsers don't pay much attention to meta tags if a real HTTP header states a different encoding.

update

Whatcha get from this?

echo bin2hex('é');
echo chr(0xc3) . chr(0xa9);

You should get c3a9é; otherwise, I'd say it's a file encoding issue.
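If the hex output is something other than c3a9, the value itself hints at which encoding the editor used when saving the file. Below is a rough sketch, assuming PHP 7 or later and a small hand-written lookup table; the constant and the helper `guess_source_encoding` are hypothetical names for illustration, and the table is far from exhaustive.

```php
<?php
// Byte sequences that encode é (U+00E9) in a few common encodings.
// Illustrative only; many other encodings exist.
const E_ACUTE_BYTES = [
    'c3a9' => 'UTF-8',
    'e9'   => 'ISO-8859-1 or Windows-1252',
    '8e'   => 'Mac OS Roman',
];

// Hypothetical helper: map the hex printed by bin2hex('é')
// to the encoding the source file was most likely saved in.
function guess_source_encoding(string $hex): string
{
    $hex = strtolower($hex);
    return E_ACUTE_BYTES[$hex] ?? 'unknown';
}

echo guess_source_encoding(bin2hex("\xC3\xA9")), "\n"; // UTF-8
echo guess_source_encoding('8e'), "\n";                // Mac OS Roman
```

A single byte such as e9 or 8e means the file was saved in a legacy single-byte encoding, so the fix belongs in the editor's save settings rather than in the PHP code.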

chris
+1 HTTP headers are often forgotten, but much more important than meta tags. You can use `header()` to output an appropriate header, Kohana might have its own wrapper around it too.
deceze
Already checked that, and yes, the charset is UTF-8.
alex
Re Update: I got `8eé`
alex
I would try a different editor.
chris