Is there a way to load a chunk of HTML into an Hpricot::Doc object?

I am trying to parse various chunks of HTML that sit inside custom tags on a page.

So if I have:

<foo>
  <b>here is some stuff</b>
  <table>
    <tr>
      <td>one</td>
      <td>two</td>
    </tr>
    <tr>
      <td>three</td>
      <td>four</td>
    </tr>
  </table>
</foo>

I would love to be able to get foo and its contents as an Hpricot::Doc object, because I need to do some additional processing and eventually swap() it so that foo and all its children are replaced in the document.

I know I can iterate over the children of foo, but I was hoping there was a way to grab it all in one chunk to keep things clean. Also, foo may or may not have attributes. There will be many foo items, each with a chunk of HTML, but no foo item will contain another foo item.

Is this at all possible? Lastly, I started with Hpricot, but I am open to Nokogiri if it would make a difference.

+1  A: 

I'm not clear on what you are having trouble with.

You can pass Hpricot your HTML any way you like.

From the README:

doc = Hpricot("<p>A simple <b>test</b> string.</p><foo>foo content</foo>")

You can then search for foo and swap it:

doc.search("//foo").first.swap "<blink>not foo</blink>"
BaroqueBobcat