I'm trying to fill the variables parent_element_h1 and parent_element_h2. Can anyone help me use the Nokogiri gem to get the information I need into those variables?
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
<html>
<body>
<p id='para-1'>A</p>
<div class='block' id='X1'>
<h1>Foo</h1>
<p id='para-2'>B</p>
</div>
<p id='para-3'>C</p>
<h2>Bar</h2>
<p id='para-4'>D</p>
<p id='para-5'>E</p>
<div class='block' id='X2'>
<p id='para-6'>F</p>
</div>
</body>
</html>
HTML_END
parent = value.css('body').first
# start_here is given: a Nokogiri::XML::Element of the <div> with the id 'X2'
start_here = parent.at('div.block#X2')
# this should be a Nokogiri::XML::Element of the nearest, previous h1.
# in this example it's the one with the value 'Foo'
parent_element_h1 =
# this should be a Nokogiri::XML::Element of the nearest, previous h2.
# in this example it's the one with the value 'Bar'
parent_element_h2 =
PLEASE NOTE: The start_here element could be anywhere inside the document. The HTML data shown here is just an example. That said, the headers (<h1> and <h2>) could be a sibling of start_here or a child of a sibling of start_here.
The following recursive method is a good starting point, but it doesn't work on <h1> because it's a child of a sibling of start_here.
def search_element(_block, _style)
  unless _block.nil?
    if _block.name == _style
      return _block
    else
      search_element(_block.previous, _style)
    end
  else
    return false
  end
end
parent_element_h1 = search_element(start_here,'h1')
parent_element_h2 = search_element(start_here,'h2')
UPDATE: After accepting an answer, I came up with my own solution. Check it out (somewhere below), it works like a charm and I think it's pretty cool. :-)