Suppose I have a large document of around 7000 words and I need to send all of its text to the server. I can't use jQuery, Prototype, etc.; it should be clean OO JavaScript. A sample page would be the JSON Russian page. I will exclude all tags and HTML markup from the words.

My questions are:

  • 1. How can I collect/harvest all (UTF-8) words from the document?
  • 2. How can I convert the result to JSON data?

  • Thanks

    +2  A: 

    This really doesn't seem like a job for Object Oriented programming. A sexy recursive function would work much better.

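     var output = [];

     // Recursively collect the text of every text node under `element`.
     function scan(element) {
        var children = element.childNodes;
        // Use an index loop: for...in on a NodeList also visits
        // non-node properties such as "length" and "item".
        for (var i = 0; i < children.length; i++) {
            var child = children[i];
            if (child.nodeType === 3) {          // 3 = text node
                output.push(child.nodeValue);
            } else if (child.nodeType === 1) {   // 1 = element node, recurse
                scan(child);
            }
        }
     }

     scan(window.document.body);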
    

    This doesn't break the text up into individual words or produce JSON; it gives you a list of the text of every text node in the document. You still need to do some cleanup on the text. In my 2 seconds of testing I found that it collects everything, including the contents of script tags and whitespace-only strings such as newlines (\n). Maybe if I feel like it I'll add more code. But this should get you going.
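    As a rough sketch of that cleanup step, the function below splits a list of text strings into words. The name extractWords and the word pattern are my own assumptions, not part of the code above; the pattern relies on Unicode property escapes (\p{L}, \p{N}), so it needs a reasonably recent JavaScript engine, and it treats an apostrophe or hyphen between letters as part of the word.

```javascript
// Hypothetical helper: split an array of text strings into individual words.
// Assumes an engine that supports Unicode property escapes (the "u" flag
// with \p{L} and \p{N}), so Cyrillic and other non-ASCII letters match.
function extractWords(texts) {
    // A run of letters/digits, optionally joined by ' or - (e.g. "don't").
    var wordPattern = /[\p{L}\p{N}]+(?:['-][\p{L}\p{N}]+)*/gu;
    var words = [];
    for (var i = 0; i < texts.length; i++) {
        // match() returns null for strings with no words (e.g. "\n")
        var found = texts[i].match(wordPattern);
        if (found) {
            for (var j = 0; j < found.length; j++) {
                words.push(found[j]);
            }
        }
    }
    return words;
}
```

    Whitespace-only strings and punctuation drop out on their own, since they contain nothing the pattern matches. Text from script tags would still need to be filtered out before this step.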

    For turning it into JSON, try Douglas Crockford's json2.js, which provides JSON.stringify. Just google it.
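    A minimal sketch of the serialization step, assuming JSON.stringify is available (natively, or via json2.js in older browsers). The helper name wordsToJson and the "count"/"words" keys are only illustrative; use whatever shape your server expects.

```javascript
// Hypothetical helper: wrap a word list in an object and serialize it.
// The "count" and "words" keys are an assumed payload shape, not a standard.
function wordsToJson(words) {
    return JSON.stringify({ count: words.length, words: words });
}

// Non-ASCII words are kept as-is in the resulting string.
var payload = wordsToJson(["Привет", "мир"]);
```

    The resulting string can then be sent as the body of a POST request.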

    Too simple. The code organization doesn't take JSON data into account, so it doesn't fit the question :) Thank you anyway.
    nodeValue != single word
    http://www.w3schools.com/JS/default.asp
    Thank you for the link, grandmaster. You're as great as Houdini.