Hello,

When I do the following with Nokogiri:

require 'nokogiri'

some_html = '<img src="bleh.jpg"/>test<br/>'
f = Nokogiri::HTML(some_html)
# do some processing
puts f

It prints the whole HTML document structure, with the code above wrapped inside it. How can I print, return, or get just the HTML that is in the some_html variable?

Thanks for help!

A: 

What do you mean by the 'html' part?

Just do f.text() to get the inner text.

CodeJoust
A: 

No, that is not what I need.

f.to_s returns the whole document:

"<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body>\n<img src=\"bleh.jpg\">test<br>\n</body></html>\n"

I only want the inner/fragment part:

<img src="bleh.jpg">test<br>

Thanks for help!

Aljaz
+2  A: 

Instead of parsing with Nokogiri::HTML(...), use Nokogiri::HTML::fragment(...), which does not wrap the input in a full document:

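require 'nokogiri'

# Parse as a fragment so no DOCTYPE, <html>, or <body> wrapper is added.
fragment = Nokogiri::HTML::fragment('<img src="bleh.jpg">test<br>')
print fragment.to_html
# >> <img src="bleh.jpg">test<br>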
Greg