I was wondering whether there is any way to define the default encoding for htmlentities(). I have a big project that calls htmlentities all over the place, and I would like a simple way to change its default character encoding from ISO-8859-1 to UTF-8, using something like ini_set, or possibly a separate namespace declaration.

Failing that, I would not be opposed to renaming and overriding the htmlentities function to always use Unicode, but am reluctant to install anything as freaky (to me) as PECL apd.

+3  A: 

As the manual page doesn't say anything about changing the default charset, I don't think there is a way to do that, and I don't remember ever having seen anything about it.

I wouldn't use anything like apd either. Instead, I would probably:

  • create my own function that calls htmlentities with the right parameters
  • and replace every call to htmlentities with a call to my new function (this can probably be done automatically with a short script)
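
A minimal sketch of such a wrapper, for illustration. The name html_utf8 and the ENT_QUOTES default are assumptions of mine, not something the manual or the question prescribes; pick whatever fits your project:

```php
<?php
// Hypothetical wrapper name; not part of PHP itself.
// It forwards to htmlentities() with the charset always set to UTF-8,
// so the result no longer depends on the built-in default.
function html_utf8($text, $flags = ENT_QUOTES)
{
    return htmlentities($text, $flags, 'UTF-8');
}

// "caf\xC3\xA9" is "café" written as raw UTF-8 bytes.
echo html_utf8("caf\xC3\xA9 <b>"), "\n"; // prints: caf&eacute; &lt;b&gt;
```

Once every call site goes through a function like this, the encoding can later be changed in one place.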
Pascal MARTIN
A: 

@Pascal MARTIN's solution is definitely correct. You can also use utf8_encode to convert ISO-8859-1 to UTF-8.

And utf8_decode to convert UTF-8 to ISO-8859-1.

Jay Zeng
The problem is not the charset the string is in; it's how htmlentities deals with it.
amphetamachine