I am trying to figure out a way to count the words in a string that contains HTML.

Example String:

<p>Hello World</p>

Is there a way in Ruby to count the words in between the p tags? Or any tag for that matter?

Examples:

<p>Hello World</p>
<h2>Hello World</h2>
<li>Hello World</li>

Thanks in advance!

Edit (here is my working code)

Controller:

class DashboardController < ApplicationController
  def index
    @pages = Page.find(:all)
    @word_count = []
  end
end

View:

<% @pages.each do |page| %>
  <% page.current_state.elements.each do |el| %>
    <% @count = Hpricot(el.description).inner_text.split.uniq.size %>
    <% @word_count << @count %>
  <% end %>

  <li><strong>Page Name: <%= page.slug %> (Word Count: <%= @word_count.inject(0){|sum,n| sum+n } %>)</strong></li>
<% end %>
A: 

You'll want to use something like Hpricot to remove the HTML; after that, it's just a matter of counting words in plain text.

Here is an example of stripping the HTML: http://underpantsgnome.com/2007/01/20/hpricot-scrub/

amarsuperstar
A: 

Start with something that can parse HTML, such as Hpricot, then use a simple regular expression to do what you want (for example, you can just split on spaces and count the pieces).
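
The split-and-count step might look roughly like this in plain Ruby. It assumes the markup has already been stripped by the parser (for example via Hpricot's inner_text), and the method names and sample string are made up for illustration:

```ruby
# Splits on whitespace: every run of non-space characters counts as a word,
# so a lone "-" is counted too.
def words_by_whitespace(text)
  text.split
end

# Uses a simple regular expression to keep only runs of word characters,
# which drops stray punctuation.
def words_by_regex(text)
  text.scan(/\w+/)
end

# Assumed input: text that an HTML parser has already extracted.
sample = "Hello World, hello - again"
puts words_by_whitespace(sample).size
puts words_by_regex(sample).size
```

Which variant fits depends on whether punctuation-only tokens should count as words.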

Jack
A: 

Sure

  1. Use Nokogiri to parse the HTML/XML and XPath to find the element and its text value.
  2. Split on whitespace to count the words.
willcodejavaforfood
+4  A: 

Here's how you can do it:

require 'hpricot'
content = "<p>Hello World...."
doc = Hpricot(content)
doc.inner_text.split.uniq

This will give you:

[
  [0] "Hello",
  [1] "World"
]

(Side note: the output is formatted with awesome_print, which I warmly recommend.)
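
One thing to keep in mind: uniq removes repeated words, so the size of that array is the number of distinct words, not the total. A small sketch of the difference, run on plain text so it does not need Hpricot (the method names and sample sentence are only illustrative):

```ruby
# Total number of whitespace-separated words.
def total_words(text)
  text.split.size
end

# Number of distinct words; uniq is case-sensitive, so "The" and "the" differ.
def distinct_words(text)
  text.split.uniq.size
end

sample = "the cat and the hat"
puts total_words(sample)     # 5
puts distinct_words(sample)  # 4
```

Drop the uniq call if you want every occurrence counted.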

Thibaut Barrère
This is just what I needed! Thanks!!!!!
dennismonsewicz
Glad it helped :)
Thibaut Barrère
I put in my working code example above
dennismonsewicz
I suggest you move the code to a dedicated helper function at some point. It will make it easier to unit-test it and reuse it.
Thibaut Barrère
How do you go about doing that? I am new to Rails
dennismonsewicz
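
One way the helper suggestion could look, as a rough sketch only. The module and method names are hypothetical; in a Rails app a module like this would normally live under app/helpers, which makes its methods available in views. To keep the example runnable without any gems, a naive regex stands in for Hpricot here; in the real app that line would be Hpricot(html).inner_text instead.

```ruby
# Hypothetical helper module; the names are illustrative, not from the original app.
module WordCountHelper
  # Counts whitespace-separated words in a single HTML fragment.
  def word_count(html)
    # Naive stand-in for Hpricot(html).inner_text: replace each tag with a space.
    text = html.to_s.gsub(/<[^>]*>/, ' ')
    text.split.size
  end

  # Sums the word counts of several HTML fragments,
  # e.g. the descriptions of a page's elements.
  def total_word_count(fragments)
    fragments.inject(0) { |sum, html| sum + word_count(html) }
  end
end

# Plain-Ruby usage outside Rails: mix the module into an object.
counter = Object.new.extend(WordCountHelper)
puts counter.word_count("<p>Hello World</p>")
puts counter.total_word_count(["<h2>Hello World</h2>", "<li>Hello again World</li>"])
```

Because the counting no longer lives in the view, it can be unit-tested on its own, and each page gets a fresh total instead of appending to a shared array. This version counts every word; add uniq before size if distinct words are what you want.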