I'm wondering if there's a way to count the words inside a non-form element such as a div. Say we have a div like so:

<div id="content">
hello how are you?
</div>

Then have the JS function return an integer of 4.

Is this possible? I have done this with form elements but can't seem to do it for non-form ones.

Any ideas?

g

+12  A: 

If you know that the DIV is only going to have text in it, you can KISS:

var count = document.getElementById('content').innerHTML.split(' ').length;

If the div can have HTML tags in it, you're going to have to traverse its children looking for text nodes:

function get_text(el) {
    var ret = ""; // declared with var so each recursive call gets its own local string
    var length = el.childNodes.length;
    for(var i = 0; i < length; i++) {
        var node = el.childNodes[i];
        if(node.nodeType != 8) {
            ret += node.nodeType != 1 ? node.nodeValue : get_text(node);
        }
    }
    return ret;
}
var words = get_text(document.getElementById('content'));
var count = words.split(' ').length;

This is the same logic that the jQuery library uses to achieve the effect of its text() function. jQuery is a pretty awesome library that in this case is not necessary. However, if you find yourself doing a lot of DOM manipulation or AJAX then you might want to check it out.
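
To see the traversal handle nested markup without needing a browser, here is a minimal sketch that runs the same logic over plain objects. The getText name and the text, comment and element helpers are hypothetical stand-ins invented for this illustration; they only mimic the nodeType, nodeValue and childNodes properties that the function reads from real DOM nodes.

```javascript
// Same traversal as get_text above, with `ret` declared locally.
function getText(el) {
    var ret = "";
    var length = el.childNodes.length;
    for (var i = 0; i < length; i++) {
        var node = el.childNodes[i];
        if (node.nodeType !== 8) {            // skip comment nodes
            ret += node.nodeType !== 1        // not an element: take its text
                ? node.nodeValue
                : getText(node);              // element: recurse into children
        }
    }
    return ret;
}

// Hypothetical mock nodes (assumption: only these three properties matter).
function text(value) { return { nodeType: 3, nodeValue: value, childNodes: [] }; }
function comment(value) { return { nodeType: 8, nodeValue: value, childNodes: [] }; }
function element(children) { return { nodeType: 1, nodeValue: null, childNodes: children }; }

// Stand-in for: <div>hello <b>how</b> are<!-- note --> you?</div>
var mockDiv = element([
    text("hello "),
    element([text("how")]),
    text(" are"),
    comment(" note "),
    text(" you?")
]);

var mockText = getText(mockDiv);            // "hello how are you?"
var mockCount = mockText.split(' ').length; // 4
```

The tag and the comment contribute nothing to the count, which is the reason for walking the text nodes rather than splitting innerHTML.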

EDIT:

As noted by Gumbo in the comments, splitting on a single space the way we do above counts an extra, empty word for every pair of consecutive spaces. Whether or not you expect that sort of input, it's best to avoid the problem by splitting on a regular expression instead of a simple space character. Keeping that in mind, instead of doing the above split, you should do something like this:

var count = words.split(/\s+/).length;

The only difference is what we're passing to the split function.
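
One caveat with the regular-expression split: leading or trailing whitespace (such as the newlines around the text in the example div) produces empty strings at the ends of the array, and an empty string still splits into one element. A minimal sketch of a guard against both, assuming a helper named countWords (a name made up here, not part of the answer above):

```javascript
// Hypothetical helper: counts whitespace-separated words in a string.
function countWords(text) {
    // Strip leading and trailing whitespace so split() yields no empty edge entries.
    var trimmed = text.replace(/^\s+|\s+$/g, "");
    // An empty string would split into [""], so report zero words explicitly.
    return trimmed.length ? trimmed.split(/\s+/).length : 0;
}

var exampleCount = countWords("\nhello how are you?\n"); // 4
var emptyCount = countWords("   ");                      // 0
```

In a page this could be fed the result of get_text, for example countWords(get_text(document.getElementById('content'))).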

Paolo Bergantino
You'll have to get the text node first.
altCognito
perfect! thank you
givp
This will count tags as words though, which is why I would prefer the text() version provided by jQuery.
altCognito
I know. As soon as I posted it I started porting text() over to plain JavaScript to provide that as an alternative. Not everyone needs jQuery in their lives. :)
Paolo Bergantino
+1, I do like that you took the logic for text from the jQuery library :)
altCognito
You should cache el.childNodes.length in a local variable - currently you're querying it on every iteration.
J-P
I'm not sure how much HTML you'd really need to make that matter. :) A heck of a lot, I'd say. But I guess I must acquiesce to your request.
Paolo Bergantino
Thank you :) Much better. +1
J-P
It would be better to use a regular expression so that multiple whitespace characters are taken into account.
Gumbo
Yet another fair point. Fixed. :)
Paolo Bergantino
Would `.textContent` (or `.innerText` for IE) not be sufficient instead of the descending traversal?
Crescent Fresh
@Paolo Bergantino: bug fix for your code. Better to use: var count = (words.length ? words.split(/\s+/).length : 0), because split returns an array containing one empty string when words is an empty string, so you would get a word count of one.
Marco Demajo
A: 
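document.deepText = function deepText(hoo) {
    // Collect the data of every text node under hoo, in document order.
    var A = [];
    if (hoo) {
        hoo = hoo.firstChild;
        while (hoo != null) {
            if (hoo.nodeType == 3) {
                A[A.length] = hoo.data;
            }
            // Recurse by name rather than arguments.callee, which throws in strict mode.
            else A = A.concat(deepText(hoo));
            hoo = hoo.nextSibling;
        }
    }
    return A;
};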

I'd be fairly strict about what a word is:

function countwords(hoo){
    var text= document.deepText(hoo).join(' ');
    // match() returns null when nothing matches, so fall back to an empty array
    return (text.match(/[A-Za-z\'\-]+/g) || []).length;
}
alert(countwords(document.body))
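
To make the effect of the stricter definition concrete, here is a small sketch comparing it with plain whitespace splitting on the same strings. The function names countStrict and countByWhitespace are assumptions chosen for this illustration; the pattern is the same character class as above, written without the redundant escapes.

```javascript
// Hypothetical helper: counts runs of letters, apostrophes and hyphens.
function countStrict(text) {
    var matches = text.match(/[A-Za-z'-]+/g); // null when nothing matches
    return matches ? matches.length : 0;
}

// Hypothetical helper: counts whitespace-separated tokens of any kind.
function countByWhitespace(text) {
    var trimmed = text.replace(/^\s+|\s+$/g, "");
    return trimmed.length ? trimmed.split(/\s+/).length : 0;
}

var sample = "hello 42 you?";
var looseCount = countByWhitespace(sample); // 3: "hello", "42", "you?"
var strictCount = countStrict(sample);      // 2: "hello", "you"
```

Numbers drop out of the strict count, and contractions or hyphenated terms such as "don't" and "stop-gap" count once each. A hyphen or apostrophe standing on its own still matches the pattern, though, so "a - b" counts as three.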
kennebec
A: 

See the related topic here: JavaScript DHTML Count Words

http://java.pakcarid.com/Cpp.aspx?sub=366&ff=2955&topid=34&sls=25

lois