How do I remove accented characters from a string? It matters especially in IE6. I had something like this:

        accentsTidy = function(s){
            var r=s.toLowerCase();
            r = r.replace(new RegExp(/\s/g),"");
            r = r.replace(new RegExp(/[àáâãäå]/g),"a");
            r = r.replace(new RegExp(/æ/g),"ae");
            r = r.replace(new RegExp(/ç/g),"c");
            r = r.replace(new RegExp(/[èéêë]/g),"e");
            r = r.replace(new RegExp(/[ìíîï]/g),"i");
            r = r.replace(new RegExp(/ñ/g),"n");                
            r = r.replace(new RegExp(/[òóôõö]/g),"o");
            r = r.replace(new RegExp(/œ/g),"oe");
            r = r.replace(new RegExp(/[ùúûü]/g),"u");
            r = r.replace(new RegExp(/[ýÿ]/g),"y");
            r = r.replace(new RegExp(/\W/g),"");
            return r;
        };

but IE6 gives me trouble; it seems it doesn't like my regular expressions.

+8  A: 

The format for new RegExp is

new RegExp(pattern, 'modifiers');

So you would want

accentsTidy = function(s){
    var r = s.toLowerCase();
    r = r.replace(new RegExp("\\s", 'g'),"");
    r = r.replace(new RegExp("[àáâãäå]", 'g'),"a");
    r = r.replace(new RegExp("æ", 'g'),"ae");
    r = r.replace(new RegExp("ç", 'g'),"c");
    r = r.replace(new RegExp("[èéêë]", 'g'),"e");
    r = r.replace(new RegExp("[ìíîï]", 'g'),"i");
    r = r.replace(new RegExp("ñ", 'g'),"n");
    r = r.replace(new RegExp("[òóôõö]", 'g'),"o");
    r = r.replace(new RegExp("œ", 'g'),"oe");
    r = r.replace(new RegExp("[ùúûü]", 'g'),"u");
    r = r.replace(new RegExp("[ýÿ]", 'g'),"y");
    r = r.replace(new RegExp("\\W", 'g'),"");
    return r;
};
Ian Elliott
+1 for Ian. @subtenante see: https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Global_Objects/RegExp
Jonathan Fingland
Couldn't get this code to work. Make sure to set the doc encoding to UTF8!
Sam V
Works well. Thanks.
jesperlind
+5  A: 

You can create regexes in multiple ways. Using the RegExp constructor:

var re = new RegExp("[a-z]", "ig"); // (string pattern, string modifiers)

Or using the regex literal notation:

var re = /[a-z]/ig; // /pattern/modifiers

You have mixed the two.
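
To make the difference concrete, here is a small sketch (the sample string and variable names are made up for illustration, not taken from the question). Both forms should build an equivalent regex; the one thing to watch is that a backslash has to be doubled in the string form.

```javascript
// Literal notation: pattern and flags are written directly between slashes.
var literal = /\s/g;

// Constructor notation: pattern and flags are strings, so the backslash
// must itself be escaped ("\\s"); a plain "\s" would just be the string "s".
var constructed = new RegExp("\\s", "g");

var sample = "a b\tc"; // hypothetical input

var viaLiteral = sample.replace(literal, "");
var viaConstructor = sample.replace(constructed, "");
// Both variables now hold "abc".
```

Picking one notation and sticking with it avoids passing a regex literal to the constructor in the first place.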

Pim Jager
A: 

function removeAccents(strAccents){
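    // Work on an array of single characters; declare the locals with var
    // so they do not leak into the global scope.
    strAccents = strAccents.split('');
    var strAccentsOut = [];
    var strAccentsLen = strAccents.length;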
    var accents = 'ÀÁÂÃÄÅàáâãäåÒÓÔÕÕÖØòóôõöøÈÉÊËèéêëðÇçÐÌÍÎÏìíîïÙÚÛÜùúûüÑñŠšŸÿýŽž';
    var accentsOut = ['A','A','A','A','A','A','a','a','a','a','a','a','O','O','O','O','O','O','O','o','o','o','o','o','o','E','E','E','E','e','e','e','e','e','C','c','D','I','I','I','I','i','i','i','i','U','U','U','U','u','u','u','u','N','n','S','s','Y','y','y','Z','z'];
    for (var y = 0; y < strAccentsLen; y++) {
        if (accents.indexOf(strAccents[y]) != -1) {
            strAccentsOut[y] = accentsOut[accents.indexOf(strAccents[y])];
        }
        else
            strAccentsOut[y] = strAccents[y];
    }
    strAccentsOut = strAccentsOut.join('');
    return strAccentsOut;
}
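
For comparison, on engines that support ES2015's String.prototype.normalize (which rules out IE6, so this is not a fix for the original question), a shorter sketch is possible. The function name stripDiacritics is hypothetical. It only handles characters that decompose into a base letter plus combining marks, so letters such as ø, æ, œ or ð come back unchanged, unlike with the lookup table above.

```javascript
// Sketch only: requires String.prototype.normalize (ES2015+), so not IE6.
function stripDiacritics(str) {
    // NFD splits e.g. "\u00e9" (e-acute) into "e" + U+0301 (combining acute).
    var decomposed = str.normalize("NFD");
    // Drop everything in the Combining Diacritical Marks block (U+0300 to U+036F).
    return decomposed.replace(/[\u0300-\u036f]/g, "");
}

// The input is written with \u escapes so the example does not depend on file encoding.
var stripped = stripDiacritics("Cr\u00e8me Br\u00fbl\u00e9e"); // "Creme Brulee"
```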
Álister
+1  A: 

Assuming you know what you're doing, I suspect IE6 is not interpreting the file's encoding correctly, and hence not recognising the non-ASCII characters in the file:

  • Make sure the file is saved as UTF-8 (say)
  • Use Fiddler or some other tool to check that the web server is sending the correct charset in the Content-Type HTTP header.

(It "smells" wrong though, I'd look into doing the sorting, say on the server using something that's locale-aware... but anyway...)
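
On the encoding point, one way to sidestep the question of how the file is saved and served, as a sketch: write the non-ASCII characters as \uXXXX escapes, so the script source is pure ASCII and should read the same whatever charset the browser assumes. The function name accentsTidyAscii is hypothetical, and only a few of the character classes from the question are shown.

```javascript
// Pure-ASCII source: each accented letter is written as a \uXXXX escape.
var accentsTidyAscii = function (s) {
    var r = s.toLowerCase();
    r = r.replace(/[\u00e0-\u00e5]/g, "a"); // a-grave through a-ring
    r = r.replace(/\u00e7/g, "c");          // c-cedilla
    r = r.replace(/[\u00e8-\u00eb]/g, "e"); // e-grave through e-diaeresis
    r = r.replace(/\u00f1/g, "n");          // n-tilde
    return r;
};

var tidied = accentsTidyAscii("Fran\u00e7ois"); // "francois"
```

The remaining classes (i, o, u, y, and the ae and oe ligatures) would follow the same pattern with their own code points.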

Duncan Smart
It's an old question that I just edited. You're right, it was a matter of encoding (IE6 does not recognize the charset 'utf-8', only 'UTF-8'). There is no point in sorting on the server, though, especially when you display a long table with several sortable columns. +1 anyway for mentioning the encoding, which turned out to be the real problem.
subtenante