I have a data file (an Apple plist, to be exact) that contains Unicode codepoints like \U00e8 and \U2019. I need to turn these into valid hexadecimal HTML entities using PHP.
What I'm doing right now is a long string of:
$fileContents = str_replace("\U00e8", "&#xe8;", $fileContents);
$fileContents = str_replace("\U2019", "&#x2019;", $fileContents);
Which is clearly dreadful. I could use a regular expression to convert the \U and any zeros that follow it to &#x, then append the closing ;, but that also seems heavy-handed.
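For reference, the regular-expression approach described above might look roughly like the sketch below. It assumes every escape in the file is a literal backslash, a capital U, and exactly four hex digits (as in the two examples here); the function name plistEscapesToEntities is made up for illustration.

```php
<?php
// Sketch only: assumes each escape is a literal backslash, "U",
// and exactly four hexadecimal digits, e.g. \U00e8 or \U2019.
// plistEscapesToEntities is a hypothetical helper name.
function plistEscapesToEntities(string $text): string
{
    return preg_replace_callback(
        // Four backslashes in a single-quoted PHP string become
        // two in the pattern, which match one literal backslash.
        '/\\\\U([0-9A-Fa-f]{4})/',
        function (array $m): string {
            // Round-trip through an integer to drop leading zeros.
            return '&#x' . dechex(hexdec($m[1])) . ';';
        },
        $text
    );
}

echo plistEscapesToEntities('caf\U00e8 and it\U2019s');
// prints: caf&#xe8; and it&#x2019;s
```

This replaces the whole chain of str_replace calls with a single pass, though it is still the regex route rather than a built-in conversion.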
Is there a clean, simple way to take a string and replace all the Unicode codepoints with HTML entities?