ansaurus

Question

Turning HTML character entities to 'regular' letters... why is it only partially working?

Answer 1

A:

It could be that you are decoding with a character encoding that differs from your page's, ISO-8859-1 vs. UTF-8, for example.

Diodeus 2010-03-01 22:03:41

Answer 2

A:

chr() only produces single bytes, so your non-ASCII characters are getting messed up. Unless I'm misunderstanding what you're trying to do, you just need a single call to html_entity_decode() with the correct charset parameter, and you can get rid of the other two lines.
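A minimal sketch of that single call, assuming the page is UTF-8. The sample string is made up for illustration, since the asker's actual input isn't shown here:

```php
<?php
// Hypothetical input mixing decimal, hexadecimal, and named entities.
$input = 'It&#8217;s caf&eacute; &amp; &#x03B1;';

// ENT_QUOTES decodes both single- and double-quote entities;
// the third argument should match the encoding the page is served in.
$decoded = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

echo $decoded, "\n"; // It’s café & α
```

If the page is served in some other encoding, pass that charset name instead of 'UTF-8'.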

Scott Reynen 2010-03-01 23:28:46

Answer 3

A:

Although the name doesn’t reflect it, html_entity_decode does also convert numeric character references.

// α (U+03B1) == 0xCEB1 (UTF-8)
var_dump("\xCE\xB1" == html_entity_decode('&#x03B1;', ENT_COMPAT, 'UTF-8'));

Gumbo 2010-03-01 23:39:34

Answer 4

+1 A:

â€™ is what you get when you read the UTF-8 encoded character ’ (RIGHT SINGLE QUOTATION MARK, U+2019) as if it were encoded as windows-1252. In other words, you have two problems: you're using the wrong encoding to read the wrong character.

HTML attribute values are supposed to be enclosed in ASCII apostrophes or quotation marks, not curly quotes. The numeric entities you're converting should be &#39; or &#x27; (apostrophe) or &#34; or &#x22; (quotation mark). Instead, you appear to have &#x2019;, which represents the same character as &rsquo;, &#8217;, or ’.

As for the second problem, the resulting text seems to be encoded as UTF-8, but at some point it's being read as if it were windows-1252. In UTF-8, the character ’ is represented by the three-byte sequence E2 80 99, but windows-1252 converts each byte separately, to â, €, and ™. Wherever that's happening, it's not in the code you showed us.
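Here's a small sketch that reproduces the misreading, in case it helps you spot where it happens. It assumes the mbstring extension is available; the variable names are just for illustration:

```php
<?php
// U+2019 RIGHT SINGLE QUOTATION MARK as UTF-8 bytes: E2 80 99
$utf8 = "\xE2\x80\x99";

// Treat those three bytes as windows-1252 and re-encode the result as UTF-8.
// Each byte becomes its own character: E2 -> â, 80 -> €, 99 -> ™
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');

echo bin2hex($utf8), "\n"; // e28099
echo $mojibake, "\n";      // â€™

// Reversing the mistake recovers the original bytes.
$repaired = mb_convert_encoding($mojibake, 'Windows-1252', 'UTF-8');
echo bin2hex($repaired), "\n"; // e28099
```

Reversing it like this only confirms the diagnosis. The real fix is to make whatever step is reading the text use UTF-8.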

The good news is that your preg_replace code seems to be working correctly. ;) But I think the others are right when they say you can use html_entity_decode() alone for that part.

Alan Moore 2010-03-02 03:34:01
