What's the smartest way to have Nokogiri select all content between the start and the stop element (including start-/stop-element)?
Check example code below to understand what I'm looking for:
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
"<html>
<body>
<p id='para-1'>A</p>
<div class='block' id='X1'>
<p class="this">Foo</p>
<p id='para-2'>B</p>
</div>
<p id='para-3'>C</p>
<p class="that">Bar</p>
<p id='para-4'>D</p>
<p id='para-5'>E</p>
<div class='block' id='X2'>
<p id='para-6'>F</p>
</div>
<p id='para-7'>F</p>
<p id='para-8'>G</p>
</body>
</html>"
HTML_END
parent = value.css('body').first
# START element
@start_element = parent.at('p#para-3')
# STOP element
@end_element = parent.at('p#para-7')
The result (return value) should look like this:
<p id='para-3'>C</p>
<p class="that">Bar</p>
<p id='para-4'>D</p>
<p id='para-5'>E</p>
<div class='block' id='X2'>
<p id='para-6'>F</p>
</div>
<p id='para-7'>F</p>
Update: This is my current solution, though I think there must be something smarter:
@my_content = ""
@selected_node = true
def collect_content(_start)
if _start == @end_element
@my_content << _start.to_html
@selected_node = false
end
if @selected_node == true
@my_content << _start.to_html
collect_content(_start.next)
end
end
collect_content(@start_element)
puts @my_content