I have a script that counts the characters in each of my comments, excluding any HTML tags.

But it doesn't take into account that my comments contain åäöÅÄÖ (Swedish letters). How do I edit the regexp so that it does not strip these letters? (If the comment is "Hej då!", the result is 6, not 7.)

Why I need this is a long story. The problem here is in the expression; I know I could use CSS and set a max-height and overflow instead. :)

// check if comments are too long
$("ol li p").each(function() {
    var regexp = /<("[^"]*"|'[^']*'|[^'">])*>/gi;
    var count = $(this).html().replace(regexp, "").length;
    if ( count >= 620 ) {
        $(this).parent().addClass("too-big-overflow");
    }
});
+4  A: 

There's no need to use a regular expression here. This should work:

$("ol li p").each(function() {
    var count = $(this).text().length;
    if ( count >= 620 ) {
        $(this).parent().addClass("too-big-overflow");
    }
});
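
As a quick sanity check, here is a minimal sketch (runnable without jQuery) suggesting that length already counts å, ä and ö as one character each. countWithoutTags is a hypothetical helper name that reuses the regexp from the question, and \u00e5 is just å written as an escape so the encoding of the file cannot interfere.

```javascript
// Hypothetical helper: strips tags with the regexp from the question, then counts.
function countWithoutTags(html) {
    var tagPattern = /<("[^"]*"|'[^']*'|[^'">])*>/gi;
    return html.replace(tagPattern, "").length;
}

console.log("Hej d\u00e5!".length);                      // 7
console.log(countWithoutTags("<em>Hej</em> d\u00e5!"));  // 7

// Caveat: if the text stores å as "a" plus a combining ring (U+030A),
// length sees two code units. normalize("NFC") composes them again.
var decomposed = "Hej da\u030a!";
console.log(decomposed.length);                          // 8
console.log(decomposed.normalize("NFC").length);         // 7
```

So with precomposed letters, which is what browsers normally hand you, both the regexp version and text().length should give 7 for "Hej då!".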
J-P
Thanks! One big "but", though: this includes the white-space, right? -- I must have screwed up from the get-go; it turns out that my code editor is the one miscounting the letters, not the script. -- I'll answer my own question with this and the alternative I worked out on my own.
elundmark
+1  A: 

This works, but it counts all white-space as well:

$("ol li p").each(function() {
    var count = $(this).text().length;
    if ( count >= 620 ) {
        $(this).parent().addClass("too-big-overflow");
    }
});

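As was pointed out to me, the script above works with Swedish letters, although it counts white-space too. To avoid that, I ended up using the script below as an alternative for Swedish text. It takes the element's text() (which already leaves out tags such as <abbr> and <span>), then uses a regular expression to count only the characters listed in it: common Swedish letters, digits, punctuation, and typical code characters like { [ ( ) ] } in case your comments contain lots of those. Line breaks and tabs from the HTML source are not in the list, so they are not counted; ordinary spaces still are.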

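$("ol li p").each(function() {
    // text() already leaves out tags such as <abbr> and <span>, so no tag-stripping regexp is needed
    var text = $(this).text();
    // Count only the listed characters: common (Swedish) letters, digits, punctuation and the space.
    // Line breaks and tabs from the HTML source are not listed, so they are skipped.
    // The hyphen is escaped: unescaped, "-: is read as a range from " to : (which also covered /).
    var matches = text.match(/[a-z0-9åäö_,éèáà´`'~ ½§£@&%#"\-\/:;<>!\[\]\\\^\$\.\|\?\*\+\(\)\{\}]/gi);
    // match() returns null when nothing matches, so guard before reading length
    var lettersCounted = matches ? matches.length : 0;
    if ( lettersCounted >= 620 ) {
        $(this).parent().addClass("too-big-overflow");
    }
});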
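
A shorter alternative, if the goal is only to ignore white-space, might be to remove the white-space instead of listing every allowed character. This is a minimal sketch, not what I originally used: countVisibleChars is a hypothetical helper name, and unlike the script above it drops ordinary spaces too and counts every other character, whatever the language.

```javascript
// Hypothetical helper: counts every character that is not white-space.
// \s covers spaces, tabs, line breaks and non-breaking spaces.
function countVisibleChars(text) {
    return text.replace(/\s+/g, "").length;
}

console.log(countVisibleChars("  Hej\n\td\u00e5!  ")); // 6

// Same check as in the scripts above; runs only where jQuery is loaded.
if (typeof jQuery !== "undefined") {
    jQuery("ol li p").each(function() {
        if (countVisibleChars(jQuery(this).text()) >= 620) {
            jQuery(this).parent().addClass("too-big-overflow");
        }
    });
}
```

Because nothing is whitelisted, letters such as ü or ø are counted without touching the expression.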
elundmark