In this kind of situation, I generally start with the string I have copy-pasted from Word:
$str = 'Danâ’s back !';
var_dump($str);
Then, going through it byte by byte, I output the hexadecimal code of each byte:
for ($i = 0; $i < strlen($str); $i++) {
    $byte = $str[$i];
    $char = ord($byte);
    printf('%s:0x%02x ', $byte, $char);
}
This gives output such as:
D:0x44 a:0x61 n:0x6e �:0xc3 �:0xa2 �:0xe2 �:0x80 �:0x99 s:0x73 :0x20 b:0x62 a:0x61 c:0x63 k:0x6b :0x20 !:0x21
Then, with a bit of guessing, luck, and trial-and-error, you'll find out that:
- â is a character that fits on two bytes: 0xc3 0xa2
- and the special-quote is a character that fits on three bytes: 0xe2 0x80 0x99
Hint : it's easier when you don't have two special characters following each other ;-)
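If you would rather not guess where one character ends and the next begins, here is a minimal sketch that lets PCRE do the grouping, assuming the string is valid UTF-8. The helper name utf8_char_bytes is made up for this illustration and is not a built-in function; preg_split, bin2hex and str_split are standard PHP functions.

```php
<?php
// Hypothetical helper (not a PHP built-in): splits a UTF-8 string into
// characters and pairs each character with the hex codes of its bytes.
function utf8_char_bytes(string $str): array
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern splits between characters instead of between bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // Assumption: the input is valid UTF-8; otherwise give up.
        return [];
    }

    $result = [];
    foreach ($chars as $char) {
        // bin2hex() returns two hex digits per byte; split them into pairs.
        $result[] = [$char, implode(' ', str_split(bin2hex($char), 2))];
    }
    return $result;
}

// Same bytes as the string above, written with escapes so the example
// does not depend on the encoding of the source file.
$str = "Dan\xc3\xa2\xe2\x80\x99s back !";
foreach (utf8_char_bytes($str) as [$char, $hex]) {
    printf("%s: %s\n", $char, $hex);
}
```

Each multi-byte character then shows up on a single line with all of its bytes, which makes the two-byte and three-byte sequences easy to spot even when they follow each other.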
After that, it's only a matter of using str_replace to replace the correct sequence of bytes with another character; for example, to replace the special-quote with a normal one:
var_dump(str_replace("\xe2\x80\x99", "'", $str));
This will give you:
string 'Danâ's back !' (length=14)
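If several special characters need the same treatment, the idea extends to a lookup table. The following is only a sketch: the helper name ascii_punctuation and the choice of characters in the map are my own assumptions, not something Word guarantees to produce, so adjust the table to whatever your byte dump actually shows.

```php
<?php
// Hypothetical helper: replaces a few common typographic characters
// (given as UTF-8 byte sequences) with plain ASCII equivalents.
function ascii_punctuation(string $str): string
{
    $map = [
        "\xe2\x80\x98" => "'",   // U+2018 left single quotation mark
        "\xe2\x80\x99" => "'",   // U+2019 right single quotation mark
        "\xe2\x80\x9c" => '"',   // U+201C left double quotation mark
        "\xe2\x80\x9d" => '"',   // U+201D right double quotation mark
        "\xe2\x80\x93" => '-',   // U+2013 en dash
        "\xe2\x80\x94" => '-',   // U+2014 em dash
        "\xe2\x80\xa6" => '...', // U+2026 horizontal ellipsis
    ];

    // strtr() with an array applies all replacements in a single pass.
    return strtr($str, $map);
}

var_dump(ascii_punctuation("Dan\xe2\x80\x99s back !"));
```

Using strtr with an array avoids chaining several str_replace calls, and keeping the byte sequences as "\x.." escapes means the script works regardless of the encoding the source file is saved in.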