I asked a very similar question a while back, and I'm wondering whether correctly sorting an array of UTF-8 strings has gotten any easier with the improvements in PHP 5.3+.

The solution provided in my previous question works, but I'm looking for a universal solution: one that doesn't depend on the locale specified, similar to what MySQL does with its UTF-8 collations.

Thanks in advance!

+3  A: 

The short answer is: you do need to be aware of the locale.

Don't confuse the charset with the locale sorting rules. UTF-8 is just a way to encode Unicode characters: it doesn't imply anything about how you handle sorting, capitalization, etc.

Here's a simple example. The Spanish language has two collations: traditional (where "ch" is considered a single letter that sorts after "c") and modern (where "ch" is two separate letters). In traditional collation you sort this way:

  1. Barro
  2. Cuenco
  3. China
  4. Dado

In modern collation you'd sort this way:

  1. Barro
  2. China
  3. Cuenco
  4. Dado

This is the same in UTF-8, Latin1, Latin9, cp850 or whatever: the encoding is not relevant.
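
To make the difference concrete, here is a rough sketch in plain PHP. The `traditionalKey()` helper is made up for this illustration and only handles the unaccented ASCII words from the lists above. It is not how a real collation library works, it just shows that the same strings end up in a different order depending on the rules you pick.

```php
<?php
// Toy sort key for traditional Spanish collation (hypothetical helper).
// The digraph "ch" is treated as one letter that sorts after "c".
// ASCII-only: accents, "ll" and "ñ" are deliberately ignored here.
function traditionalKey($word)
{
    // '{' is the byte right after 'z' in ASCII, so "c{" sorts after
    // every "c" + letter combination but still before "d".
    return str_replace('ch', 'c{', strtolower($word));
}

$words = array('Dado', 'China', 'Cuenco', 'Barro');

// Modern rules: "ch" is just "c" followed by "h".
$modern = $words;
usort($modern, function ($a, $b) {
    return strcmp(strtolower($a), strtolower($b));
});

// Traditional rules: compare the transformed keys instead.
$traditional = $words;
usort($traditional, function ($a, $b) {
    return strcmp(traditionalKey($a), traditionalKey($b));
});

echo implode(', ', $modern), PHP_EOL;      // Barro, China, Cuenco, Dado
echo implode(', ', $traditional), PHP_EOL; // Barro, Cuenco, China, Dado
```

Nothing about the byte encoding changed between the two sorts, only the comparison rules did.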

Álvaro G. Vicario
+1  A: 

The problem with locales in PHP is that they are not thread safe. If you run Apache in threaded mode, you practically can't use `setlocale()`, since it changes the locale for every thread in the process.

Now, I just found what looks like a solution: the `Collator` class in the Intl extension. It has methods for string comparison and sorting. Docs are here: http://php.net/manual/en/class.collator.php
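
A minimal sketch of how that might look, assuming the intl extension is loaded. The `sortWithLocale()` wrapper is a name I made up for the example. The locale IDs `es` and `es@collation=traditional` are ICU-style identifiers, and the resulting order depends on the ICU data shipped with your PHP build, so treat this as something to try rather than a guaranteed result.

```php
<?php
// Sort a list of strings with the intl Collator for a given locale.
// sortWithLocale() is a hypothetical helper, not part of PHP.
// Returns null when the intl extension is not available.
function sortWithLocale(array $words, $locale)
{
    if (!class_exists('Collator')) {
        return null;
    }

    // The collator carries its own locale; setlocale() is never called,
    // so no process-wide state is changed.
    $collator = new Collator($locale);
    $collator->sort($words); // sorts the local copy in place

    return $words;
}

$words = array('Dado', 'China', 'Cuenco', 'Barro');

$modern      = sortWithLocale($words, 'es');
$traditional = sortWithLocale($words, 'es@collation=traditional');

if ($modern === null) {
    echo 'intl extension not loaded', PHP_EOL;
} else {
    echo implode(', ', $modern), PHP_EOL;
    echo implode(', ', $traditional), PHP_EOL;
}
```

Because each `Collator` object is independent, two requests can sort with different locales at the same time without stepping on each other.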

Viktor Söderqvist
Thanks, I know about the `Collator` class but I still have to construct it with a given locale.
Alix Axel