ansaurus

Question

Generate Raw String Meta Description for HTML Page from HTML String in Ruby?

Answer 1

A:

Does it have to be in Ruby? I can write it in PHP:

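$text = '<html> ...';
// Strip tags and decode entities, then turn newlines into '. ', other whitespace into ' ', and double quotes into single quotes.
$result = preg_replace(array('/\\n+/', '/\\s+/', '/"/'), array('. ', ' ', '\''), html_entity_decode(strip_tags($text)));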

dmitrig01 2010-01-03 23:09:02

Answer 2

A:

Hmm. That seems like rather a lot of functionality for a one-liner. If you just want to parse and display an HTML page as plain text, I'd recommend using w3m.

string = "..." # your string

IO.popen("w3m -T text/html", "r+") do |pipe|
  pipe.write string
  pipe.close_write
  puts pipe.read
end

Gives me:

My Page Title

Production Manager

    “I want my passion for business plan and my pride in my work to show in
    every step of our company: from the labels and papers, to our relationships
    with our customers, to the enjoyment of each bottle of My Company business
    plan. As we expand our production, my dream is to plant a company of my own
    to specialize in good business, my personal favorite varietal.”

- John Smith

Born and raised on the north coast of California, John Smith always felt a deep
connection to this......

For the rest of the substitutions, I'd recommend applying a regexp replace either before or after processing, depending on your exact needs.
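As a rough sketch of that post-processing step, assuming the remaining substitutions are collapsing whitespace into single spaces and swapping double quotes for single quotes (the same kind of replacements the PHP answer makes), something like the following could be applied to the text w3m returns. The helper name `flatten_for_meta` and the sample string are made up for illustration; adjust the patterns to your exact needs.

```ruby
# Hypothetical helper (not part of w3m or the Ruby standard library):
# flattens plain text, such as w3m's output, into a single line.
def flatten_for_meta(text)
  text
    .gsub(/\s+/, " ")  # newlines, tabs, and runs of spaces become one space
    .tr('"', "'")      # a double quote would end a content="..." attribute early
    .strip             # drop leading and trailing whitespace
end

# Made-up sample input standing in for the text read from the w3m pipe.
sample = "My Page Title\n\nProduction Manager\n\n    \"I want my passion\tto show\"\n"
puts flatten_for_meta(sample)
```

In the w3m snippet above, this would mean passing pipe.read to the helper instead of printing it directly.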

Brian Campbell 2010-01-03 23:09:30
