I have an Arial character giving me a headache. U+02DD turns into a question mark after I turn its document into a phpQuery object. What is an efficient way to remove the character in PHP by referring to it as 'U+02DD'?

+1  A: 

Use preg_replace and do it like this:

$str = "your text with that character";

echo preg_replace("#\x{02DD}#u", "", $str); // EDIT: added the u modifier for Unicode

To match Unicode characters beyond the single-byte range, add the u modifier and write the character in the pattern as \x{abcd}. The second parameter is an empty string, so preg_replace replaces your character with nothing, effectively removing it.
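
Since the question asks about referring to the character as 'U+02DD', here is a rough sketch of a wrapper that builds the \x{...} pattern from that notation. The function name removeCodePoint is made up for illustration, and the sketch assumes PHP 7 or later and UTF-8 input:

```php
<?php
// Hypothetical helper (not part of PHP): remove every occurrence of one
// code point, given in "U+XXXX" notation, from a UTF-8 string.
function removeCodePoint(string $text, string $notation): string
{
    // Accept notation such as "U+02DD" (case-insensitive, 4 to 6 hex digits).
    if (!preg_match('/^U\+([0-9A-F]{4,6})$/i', $notation, $m)) {
        throw new InvalidArgumentException("Not a U+XXXX code point: $notation");
    }

    // Build a pattern such as /\x{02DD}/u. The u modifier makes PCRE
    // treat both the pattern and the subject as UTF-8.
    $pattern = '/\x{' . $m[1] . '}/u';

    $result = preg_replace($pattern, '', $text);
    if ($result === null) {
        // preg_replace() returns null on failure, e.g. for invalid UTF-8 input.
        throw new RuntimeException('preg_replace failed, code ' . preg_last_error());
    }
    return $result;
}

// "\u{02DD}" is the PHP 7+ string escape for the same character.
echo removeCodePoint("fancy \u{02DD}quote\u{02DD}", 'U+02DD'), "\n"; // fancy quote
```

If the input is not valid UTF-8, preg_replace fails instead of replacing anything, which is why the sketch checks for null rather than returning it silently.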


[EDIT] Another way:

Have you tried running htmlentities() on it? Its numeric HTML entity is &#733;, so doing that or replacing the character with &#733; may solve your issue too. Like this:

echo preg_replace("#\x{02DD}#u", "&#733;", $str);
shamittomar
@shamittomar - The preg_replace method returned this warning: preg_replace() [function.preg-replace]: Compilation failed: character value in \x{...} sequence is too large at offset 7. I copied and pasted your pattern.
JMC
@JMC, sorry, I forgot to put the trailing `u` modifier (for Unicode) in the preg_replace pattern. I have updated the code. It no longer gives that error and replaces the character successfully.
shamittomar
@shamittomar - If I wanted to use preg_replace on "˝" how would I write the expression?
JMC
@JMC. Updated the answer for "how to write with `˝`".
shamittomar
+3  A: 

You can use iconv() to convert character sets and strip invalid characters.

<?php
 /* This will convert ISO-8859-1 input to UTF-8 output and
  * silently discard any characters that cannot be represented
  * in the output charset.
  */
 $output = iconv("ISO-8859-1", "UTF-8//IGNORE", $input);

 /* This will attempt to transliterate characters that cannot be
  * represented in the output charset into something that looks
  * approximately correct.
  */
 $output = iconv("ISO-8859-1", "UTF-8//TRANSLIT", $input);
?>

See the iconv() documentation at http://php.net/manual/en/function.iconv.php
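
As a small sanity check around the conversion, something like the following might help. It is only a sketch: the sample bytes are made up, it assumes the input really is ISO-8859-1, and it uses the empty-pattern preg_match('//u', ...) idiom to test whether a string is valid UTF-8.

```php
<?php
// Made-up sample input: "café" encoded as ISO-8859-1 (0xE9 is é).
$input = "caf\xE9";

// iconv() returns false on failure, so check before using the result.
$output = iconv('ISO-8859-1', 'UTF-8', $input);
if ($output === false) {
    throw new RuntimeException('iconv() could not convert the input');
}

// An empty pattern with the u modifier only matches valid UTF-8 subjects.
$isValidUtf8 = preg_match('//u', $output) === 1;

echo bin2hex($output), "\n"; // 636166c3a9
echo $isValidUtf8 ? "valid UTF-8\n" : "invalid UTF-8\n";
```

Once the string is known to be valid UTF-8, a preg_replace call with the u modifier can be applied to it without failing on the encoding.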

spuriousdata
@spuriousdata - +1 this is a good suggestion. I had to use this in combination with shamittomar's answer to send the fancy quote mark back into the ground where it belongs. I believe it is a spawn of satan.
JMC