When I evaluate "Ł" > "Z" in JavaScript, it returns true. In alphabetical (Polish) order it should of course be false. How can I fix this? My site uses UTF-8.

+2  A: 

You may be able to build your own sorting function using localeCompare() that - at least according to the MDC article on the topic - should sort things correctly.
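
A minimal sketch of that idea, assuming an engine that honours the optional locales argument of localeCompare() and provides Intl.Collator (older browsers may ignore the argument). The word list and the name comparePolish are made up for illustration:

```javascript
// Illustrative data (not from the question).
var names = ["Zebra", "Łódź", "Lato", "Ćma", "Cel"];

// Comparator built on localeCompare(); "pl" requests Polish collation rules.
function comparePolish(a, b) {
  return a.localeCompare(b, "pl");
}

var sortedNames = names.slice().sort(comparePolish);

// Intl.Collator does the same job and can be reused across many comparisons.
var collator = new Intl.Collator("pl");
var sortedWithCollator = names.slice().sort(collator.compare);
```

Whether the result matches the expected Polish order depends on the locale data shipped with the browser, so it is worth testing in every browser you need to support.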

If that doesn't work out, here is an interesting SO question where the OP employs string replacement to build a "brute-force" sorting mechanism.

Also in that question, the OP shows how to build a custom textExtract function for the jQuery tablesorter plugin that does locale-aware sorting - maybe also worth a look.
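
The string-replacement idea can be sketched roughly as follows. This is not the code from the linked question, only an illustration under stated assumptions: the input contains ASCII characters and Polish letters only, and the names POLISH_KEYS, sortKey and compareByKey are hypothetical. Each Polish letter is replaced by its base letter plus "~", which sorts after every ASCII letter in plain code-unit order:

```javascript
// Hypothetical mapping: Polish letter -> base letter + "~" marker(s).
// "~" (code 126) is greater than "z" (code 122), so "a~" sorts after "az".
var POLISH_KEYS = {
  "ą": "a~", "ć": "c~", "ę": "e~", "ł": "l~", "ń": "n~",
  "ó": "o~", "ś": "s~", "ź": "z~", "ż": "z~~"
};

// Builds a key that can be compared with the ordinary < and > operators.
function sortKey(s) {
  return s.toLowerCase().replace(/[ąćęłńóśźż]/g, function (ch) {
    return POLISH_KEYS[ch];
  });
}

// Comparator for Array.prototype.sort based on the generated keys.
function compareByKey(a, b) {
  var ka = sortKey(a), kb = sortKey(b);
  return ka < kb ? -1 : (ka > kb ? 1 : 0);
}

var sample = ["zebra", "łódź", "lato", "żaba", "źrebak", "ćma", "cel"];
var sortedSample = sample.slice().sort(compareByKey);
```

The approach is explicit and browser-independent, but every supported alphabet needs its own hand-written mapping.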

Edit: As a totally far-out idea (I have no idea whether this is feasible at all, especially because of performance concerns): if you are working with PHP/MySQL on the back end anyway, you could send an Ajax request to the server and have MySQL do the sorting. MySQL is great at sorting locale-aware data, because you can force sorting operations into a specific collation using e.g. ORDER BY xyz COLLATE utf8_polish_ci, COLLATE utf8_german2_ci, and so on. Those collations would take care of all sorting woes at once.

Pekka
Thanks for the links. It's a bit of a shame that JavaScript doesn't support this in core, but it's still a working solution.
Tomasz Wysocki
Be careful with `localeCompare()` in IE6: http://blog.schmichael.com/2008/07/14/javascript-collation-fail/
BalusC
@BalusC the comments in that article claim that it's in fact Wine's fault, not IE6's. Can't find anything else on the issue to confirm or disprove it, and I'm too lazy to build a test case right now... @Tomasz if you go this route, it would be interesting to hear whether things work well in IE6.
Pekka
Oh, I didn't see the comment before. Either way, to avoid unforeseen browser inconsistencies (I still don't have a solid feeling about `localeCompare()`), I'd implement a custom one like Tomalak did in your linked topic.
BalusC
@BalusC I agree that's probably the best and most solid way to go.
Pekka
Of the "hard-coded" approaches, I like Mic's the most. It's more explicit than the replace-and-compare one. localeCompare would be great, but it doesn't work in every browser/configuration (e.g. it doesn't work in my Google Chrome, but works fine in Firefox on the same computer).
Tomasz Wysocki
A: 

The code for Ł is 321. The code for Z is 90.

321 > 90 = true
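
These numbers can be checked directly with charCodeAt(). The snippet below is only an illustration (the variable names are made up); it shows that the > operator compares UTF-16 code units, not alphabetical positions:

```javascript
var codeOfStrokeL = "Ł".charCodeAt(0); // 321 (U+0141)
var codeOfZ = "Z".charCodeAt(0);       // 90  (U+005A)

// Relational operators on strings compare these code units,
// so "Ł" > "Z" is true even though Ł precedes Z in the Polish alphabet.
var strokeLAfterZ = "Ł" > "Z";         // true

// The same rule puts every uppercase ASCII letter before every lowercase one.
var upperZBeforeLowerA = "Z" < "a";    // true (90 < 97)
```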

"L" is 76 so maybe you have a typo?

Silkster
Sorting strings is not about the codes. If it were, `Z` (90) would sort before `a` (97), because 90 < 97.
Tomasz Wysocki
In fact, it's not even about characters. German "ß" should sort as "ss" - it's a string operation.
MSalters
+3  A: 

Here is an example for the French alphabet that could help you build a custom sort:

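var alpha = function(alphabet, dir, caseSensitive){
  // Returns a comparator for Array.prototype.sort that orders strings
  // by the position of each character in `alphabet`.
  dir = dir || 1; // 1 = ascending, -1 = descending
  return function(a, b){
    var pos = 0, min;
    if(!caseSensitive){
      a = a.toLowerCase();
      b = b.toLowerCase();
    }
    min = Math.min(a.length, b.length);
    // Skip the common prefix.
    while(pos < min && a.charAt(pos) === b.charAt(pos)){ pos++; }
    // Equal strings, or one is a prefix of the other: shorter string first.
    if(pos === min){
      return a.length === b.length ? 0 : (a.length > b.length ? dir : -dir);
    }
    return alphabet.indexOf(a.charAt(pos)) > alphabet.indexOf(b.charAt(pos)) ?
      dir:-dir;
  };
};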

To use it on an array of strings a:

a.sort(
  alpha('ABCDEFGHIJKLMNOPQRSTUVWXYZaàâäbcçdeéèêëfghiïîjklmnñoôöpqrstuûüvwxyÿz')
);

Add 1 or -1 as the second parameter of alpha() to sort ascending or descending.
Add true as the 3rd parameter to sort case sensitive.

You may need to add digits and special characters to the alphabet list.
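
For the Polish case from the question, the same technique could look like the sketch below. It is a self-contained variant, not the code above: byAlphabet is a hypothetical helper that precomputes each letter's rank, the alphabet string includes q, v and x for foreign words, and the word list is made up. Characters missing from the alphabet are assumed to sort first:

```javascript
// Hypothetical helper: builds a case-insensitive comparator from an
// ordered alphabet string.
function byAlphabet(alphabet) {
  var rank = {};
  for (var i = 0; i < alphabet.length; i++) {
    rank[alphabet.charAt(i)] = i;
  }
  return function (a, b) {
    a = a.toLowerCase();
    b = b.toLowerCase();
    var min = Math.min(a.length, b.length);
    for (var pos = 0; pos < min; pos++) {
      var ca = a.charAt(pos), cb = b.charAt(pos);
      if (ca !== cb) {
        // Characters missing from the alphabet get rank -1 and sort first.
        var ra = rank.hasOwnProperty(ca) ? rank[ca] : -1;
        var rb = rank.hasOwnProperty(cb) ? rank[cb] : -1;
        if (ra !== rb) return ra - rb;
        // Both characters are unlisted: fall back to code-unit order.
        return ca < cb ? -1 : 1;
      }
    }
    // Equal up to the shorter length: the shorter string sorts first.
    return a.length - b.length;
  };
}

var polish = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż";
var words = ["Zebra", "Łódź", "Lato", "łyk", "Ćma", "Cel"];
var sortedWords = words.slice().sort(byAlphabet(polish));
```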

Mic
If you are using this code, also see: http://stackoverflow.com/questions/3630645/how-to-compare-utf-8-strings-in-javascript/3633725#3633725
Tomasz Wysocki
+1  A: 

Mic's code, improved to handle characters that are not listed in the alphabet:

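var alpha = function(alphabet, dir, caseSensitive){
  dir = dir || 1; // 1 = ascending, -1 = descending
  // Returns true when letter a should sort after letter b.
  function compareLetters(a, b) {
    var ia = alphabet.indexOf(a);
    var ib = alphabet.indexOf(b);
    if(ia === -1 || ib === -1) {
      // Only a is missing from the alphabet: place it relative to 'a'.
      if(ib !== -1)
        return a > 'a';
      // Only b is missing from the alphabet: place it relative to 'a'.
      if(ia !== -1)
        return 'a' > b;
      // Both are missing: fall back to the built-in comparison.
      return a > b;
    }
    return ia > ib;
  }
  return function(a, b){
    var pos = 0;
    if(!caseSensitive){
      a = a.toLowerCase();
      b = b.toLowerCase();
    }
    var min = Math.min(a.length, b.length);
    // Skip the common prefix.
    while(pos < min && a.charAt(pos) === b.charAt(pos)){ pos++; }
    // Equal strings, or one is a prefix of the other: shorter string first.
    if(pos === min){
      return a.length === b.length ? 0 : (a.length > b.length ? dir : -dir);
    }
    return compareLetters(a.charAt(pos), b.charAt(pos)) ? dir:-dir;
  };
};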

function assert(bCondition, sErrorMessage) {
  if (!bCondition) {
    throw new Error(sErrorMessage);
  }
}

assert(alpha("bac")("a", "b") === 1, "b comes before a in the alphabet 'bac'");
assert(alpha("abc")("ac", "a") === 1, "a shorter string comes before a longer one");
assert(alpha("abc")("1abc", "0abc") === 1, "unlisted chars are compared normally");
assert(alpha("abc")("0abc", "1abc") === -1, "unlisted chars are compared normally [2]");
assert(alpha("abc")("0abc", "bbc") === -1, "unlisted chars are compared with listed chars in a special way");
assert(alpha("abc")("zabc", "abc") === 1, "unlisted chars are compared with listed chars in a special way [2]");
Tomasz Wysocki