$str = "This is a string containing 中文 characters. Some more characters - 中华人民共和国 ";

How do I detect Chinese characters in this string and print the part that starts with the first Chinese character and ends with "-"? (The result would be "中文 characters. Some more characters -".)

Thank you!

A: 

Is PHP storing this as Unicode? If so, at worst you could step through the string, character by character, until you hit those within the Chinese range.
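
For what it's worth, here is a rough sketch of the character-by-character idea. It assumes the string is UTF-8, that the mbstring extension is available (PHP 7.2+ for mb_ord), and that "Chinese" means the CJK Unified Ideographs block U+4E00..U+9FFF only; the helper name isCjkIdeograph is made up for illustration.

```php
<?php
// Hypothetical helper: true if $ch (one UTF-8 character) lies in the
// CJK Unified Ideographs block U+4E00..U+9FFF. Other CJK blocks
// (extensions, compatibility ideographs) are deliberately ignored here.
function isCjkIdeograph(string $ch): bool
{
    $cp = mb_ord($ch, 'UTF-8');
    return $cp !== false && $cp >= 0x4E00 && $cp <= 0x9FFF;
}

$str = "This is a string containing 中文 characters. Some more characters - 中华人民共和国 ";

// Split into characters rather than bytes; the /u flag makes PCRE read UTF-8.
$chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

// Step through the string until the first character in the Chinese range.
$start = null;
foreach ($chars as $i => $ch) {
    if (isCjkIdeograph($ch)) {
        $start = $i;
        break;
    }
}

// From that position, keep everything up to and including the first "-".
$result = null;
if ($start !== null) {
    $tail = implode('', array_slice($chars, $start));
    $dash = mb_strpos($tail, '-', 0, 'UTF-8');
    if ($dash !== false) {
        $result = mb_substr($tail, 0, $dash + 1, 'UTF-8');
    }
}

echo $result, "\n"; // prints: 中文 characters. Some more characters -
```

If no Chinese character or no "-" is found, $result stays null and only a newline is printed.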

Check this out too: PHP: Unicode - Manual

boost
@Josh - if you follow boost's suggestion, you might also like to look at VonC's answer to this question: http://stackoverflow.com/questions/1366068/whats-the-complete-range-for-chinese-characters-in-unicode
JV
@boost, yes, PHP is storing the string as Unicode. But how do I accomplish it? I'm not very good at PHP. @JV, thanks, I'll take a look at it.
Josh
If you do not convert it to NCR form, the characters may get corrupted during transactions.
Shivan Raptor
+1  A: 

I've solved this problem using preg_match and regular expressions:

$str = "This is a string containing 中文 characters. Some more characters - 中华人民共和国 ";

preg_match('/[\x{4e00}-\x{9fa5}]+.*\-/u', $str, $matches);
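
A self-contained sketch of the same approach that also prints the result. Note that preg_match expects the pattern as a quoted string, and the /u modifier is what lets \x{4e00} be read as a Unicode code point. One deliberate change here, which is my assumption about what is wanted: .*? is lazy, so the match ends at the first "-" after the Chinese characters instead of the last one in the string (for this particular $str both give the same result).

```php
<?php
$str = "This is a string containing 中文 characters. Some more characters - 中华人民共和国 ";

// Single quotes keep \x{...} intact so PCRE, not PHP, interprets it.
// [\x{4e00}-\x{9fa5}]+ : one or more characters in the Chinese range used above
// .*?-                 : as little as possible, up to and including the first "-"
if (preg_match('/[\x{4e00}-\x{9fa5}]+.*?-/u', $str, $matches) === 1) {
    echo $matches[0], "\n"; // prints: 中文 characters. Some more characters -
} else {
    echo "no match\n";
}
```

$matches[0] holds the full matched text; preg_match returns 1 on a match, 0 on no match, and false on error.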
Josh
Thanks for this... curious, where is the ability to use \x{unicode#} documented?
philfreo