It is UTF-8. For example, 情報 is 2 characters while ラリー ペイジ is 6 characters.

A: 

Use mb_strlen for multibyte character encodings like UTF-8.

Emil Vikström
When I use mb_strlen(), it is 6 for 情報 and 19 for ラリー ペイジ
Steven
For some reason I get an "undefined function" error for mb_strlen :S
Question Mark
Question Mark, check that you have the mbstring extension installed and activated in PHP.
Emil Vikström
Steven, try to specify which encoding the string is in as the second parameter to mb_strlen. Maybe it failed to detect your encoding automatically.
Emil Vikström
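
The two suggestions in the comments above (check that mbstring is available, and pass the encoding explicitly) can be sketched together as follows. This is only an illustration: the helper name utf8_length is hypothetical, and the sketch assumes the input string really is UTF-8 and that the source file itself is saved as UTF-8.

```php
<?php
// Hypothetical helper (not part of PHP): counts characters in a UTF-8 string.
function utf8_length(string $text): int
{
    // mb_strlen() only exists when the mbstring extension is loaded.
    if (!function_exists('mb_strlen')) {
        throw new RuntimeException('The mbstring extension is not loaded.');
    }

    // Pass the encoding explicitly instead of relying on the internal default.
    return mb_strlen($text, 'UTF-8');
}

$a = "情報";

echo strlen($a) . "\n";      // counts bytes: 6, since each character is 3 bytes in UTF-8
echo utf8_length($a) . "\n"; // counts characters: 2
```

Getting 6 for 情報 is what you would expect when the bytes are counted rather than the characters, which suggests the encoding was not detected as UTF-8.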
A: 

You can use strlen(utf8_decode('情報'));

Question Mark
+4  A: 

Code

$a = "情報";
$b = "ラリーペイジ";

echo mb_strlen($a, 'UTF-8') . "\n";
echo mb_strlen($b, 'UTF-8') . "\n";

Result

2
6
Peter Lindqvist
Oh, BTW, I removed the space in ラリー ペイジ
Peter Lindqvist
This will probably get undefined results on characters which have no equivalent in ISO-8859-1.
Emil Vikström
You're probably right about the `utf8_decode()`.
Peter Lindqvist
Code is changed to be more "safe"
Peter Lindqvist
A: 

Although the second argument of mb_strlen is documented as optional, you should always pass it for portability, even when your internal encoding and the string's encoding are the same.

mb_strlen('情報', 'UTF-8');

The same applies to most multi-byte functions, including mb_substr, for which the utf8_decode option would not work at all.

Benoit Vidis
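
To illustrate the point about mb_substr, here is a minimal sketch comparing it with the byte-oriented substr. It assumes the string is UTF-8 and the source file is saved as UTF-8; the variable names are only for illustration.

```php
<?php
$name = "ラリーペイジ";

// substr() works on bytes: 4 bytes is one full 3-byte character
// plus the first byte of the next one, so the result is cut mid-character.
$broken = substr($name, 0, 4);

// mb_substr() works on characters when the encoding is given explicitly.
$first3 = mb_substr($name, 0, 3, 'UTF-8');

echo $first3 . "\n";                           // ラリー
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false): no longer valid UTF-8
```

A utf8_decode-style workaround cannot help here, because the goal is to get the original characters back, not just to count them.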