I recently modified a JavaScript word-count function for my website so that it counts the initial number of words in the textarea, but it doesn't quite work:

function wordCounter(field,countfield)
{
    var maxlimit = 200;
    var wordcounter = maxlimit - information.value.split(' ').length;
    for (x = 0; x < field.value.length; x++) 
    {
        if(field.value.charAt(x) == " " && field.value.charAt(x-1) != " ") // Counts the spaces while ignoring double spaces, usually one in between each word.
        {
            wordcounter++ 
        }

        if (wordcounter > 250) 
        {
            field.value = field.value.substring(0, x);
        }
        else
        {
            countfield.value = maxlimit - wordcounter;
        }
    }
}
+1  A: 

The simple way would be to count the number of spaces and add 1.

Edit: add example

This essentially does that by splitting on runs of whitespace:

var str = 'adfs asdf a asdf';
alert(str.split(/\s+/).length);
Galen
Well, and add 1 to the end result. ;-)
Wim Hollebrandse
That's a +1, +1 that.
Wim Hollebrandse
You could even refine this to avoid multiple spaces being counted as more words. This would be a simple matter of searching your string and replacing any instances of `__` with `_` (substitute spaces for underscores, of course). When all instances of multiple spaces are dealt with, count up spaces. It's not perfect, of course, but automated word counts are usually just a decent approximation.
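A rough sketch of that idea, in case it helps. The function name countWordsBySpaces is made up for illustration, and it only looks at plain spaces, not tabs or line breaks:

```javascript
// Hypothetical helper; the name is illustrative and not from the thread.
function countWordsBySpaces(text) {
    // Collapse every run of two or more spaces into a single space.
    var collapsed = text.replace(/ {2,}/g, " ");
    // Drop one leading or trailing space so it is not counted as a separator.
    collapsed = collapsed.replace(/^ | $/g, "");
    if (collapsed === "") {
        return 0; // nothing but spaces (or nothing at all)
    }
    // Count the remaining single spaces; match() returns null when there are none.
    var spaces = collapsed.match(/ /g);
    return (spaces ? spaces.length : 0) + 1;
}
```

As noted, this is still only an approximation: anything separated by spaces counts as a word.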
Frank DeRosa
A: 

A simpler method would be to use JavaScript RegEx to count the words.
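
One possible way to do that (a minimal sketch, not necessarily what was meant here) is to match runs of non-whitespace characters and count the matches. The name countWords is just an assumption for the example:

```javascript
// Hypothetical helper: counts runs of non-whitespace characters.
function countWords(text) {
    // match() with the g flag returns an array of all matches, or null if there are none.
    var words = text.match(/\S+/g);
    return words ? words.length : 0;
}

countWords("The fox jumped over the lazy dog."); // 7
```

Because it counts the words themselves rather than the gaps between them, leading, trailing, and repeated whitespace do not affect the result.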

Paul Sasik
+2  A: 

The easiest solution is to count the number of non-consecutive whitespace characters (spaces, tabs, etc.), plus one.

RegEx:

\S\s

JavaScript:

var str = "The fox jumped over the lazy dog.";
var wordcount = (str.match(/\S\s/g) || []).length + 1; // match() returns null when nothing matches

Note that I didn't use "\s+", because I don't need to match all whitespace, only the whitespace characters that occur after a non-whitespace. This has two advantages:

  1. Slightly smaller overhead when the string has many duplicated whitespace characters.
  2. Won't return an extra word in the count if the input starts with whitespace.

Many answers here use split() instead. The only advantage to split() is not having to add 1 to the answer, but IMHO, match() is the better answer because the code is more readable. The purpose of the code is to find word boundaries, not to split the words.

Also, though match() and split() return arrays, match() has a smaller memory overhead, for two reasons:

  • One fewer element (not a big deal)
  • It only returns two characters in each array element (could be significant)
richardtallent
I just like the split() version more because it is semantically more pleasing, it really does count the words and you don't have to understand why there is +1 in the end. Also, this code gives an incorrect result if the string is right-padded with whitespace.
Jaanus
+5  A: 

Given string "s", you do this:

var numWords = s.replace(/^\s+|\s+$/g,"").split(/\s+/).length;

This splits the string at all whitespace (spaces, line breaks, etc.) and also handles multiple consecutive spaces. EDIT: added an inline trim to strip whitespace from the beginning and end.

Jaanus
For some odd reason, no matter which solution I use, it keeps adding two to the count instead of one.
death the kid
Tiny issue: this will overstate the word count if the input is left-padded with a whitespace character.
richardtallent
richardtallent—thanks for the remark, correct, added inline trim to fix.
Jaanus
You'd better check first that s is not an empty string: in that case split() returns an array containing one empty string, so you would get a word count of 1 instead of the correct value, zero.
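Something along these lines should cover the empty-string case. It is only a sketch: the wrapper function and its name countWordsTrimSplit are assumptions added around the one-liner from the answer:

```javascript
// Hypothetical wrapper around the trim-and-split one-liner.
function countWordsTrimSplit(s) {
    // Strip leading and trailing whitespace first.
    var trimmed = s.replace(/^\s+|\s+$/g, "");
    // "".split(/\s+/) yields [""], which has length 1, so handle it explicitly.
    if (trimmed === "") {
        return 0;
    }
    return trimmed.split(/\s+/).length;
}
```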
Marco Demajo
A: 

x = "a b f d"; alert('x has ' + x.split(/\s+/).length + ' words')

prime_number
A: 

I'm not sure I understand exactly what you need (the code you pasted confuses me a bit), but here's a simple function for word count:

var text = "1 2 3 4000";

function wordCounter( text ) {
    var word_count = text.split(" ");   // var keeps word_count from leaking into the global scope

    return word_count.length;
}

wordCounter( text );    // returns 4 as expected
Ondrej Slinták
This won't split on tab, CR, LF, etc., but should if the OP is dealing with multi-line text. This will also give the wrong answer if multiple spaces are used between words or sentences (two spaces after a full stop is a common unfortunate convention with some people).
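To illustrate the difference, here is a small comparison. The sample string is invented for the example; it has a double space after the full stop, a tab, and a line break:

```javascript
// Made-up sample text with a double space, a tab, and a newline.
var sample = "one two.  Three\tfour\nfive";

// Splitting on a single space: the double space produces an empty element,
// and the tab and newline are not treated as separators at all.
var bySpace = sample.split(" ");

// Splitting on runs of any whitespace separates all five words.
var byWhitespace = sample.split(/\s+/);

bySpace.length;      // 4
byWhitespace.length; // 5
```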
richardtallent
Ah, yeah, jeez, my fault. RegEx version is simpler and cleaner anyway.
Ondrej Slinták