Is there a way to select all the contents of a node in Nokogiri?

<root>
    <element>this is <hi>the content</hi> of my æøå element</element>
</root>

The result of getting the content of /root/element should be this is <hi>the content</hi> of my æøå element

Edit:

It seems like the solution is simply to use myElement.inner_html(). The problem I had was in fact that I was relying on an old version of libxml2, which escaped all the special characters.

A: 
Nokogiri.parse('<root><element>this is <hi>the content</hi> of my element</element></root>').css('element').inner_html

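If the result comes back with escaped entities, you can unescape it with the CGI.unescapeHTML method (CGI.unescape only does URL decoding):

require 'cgi'
require 'nokogiri'
x = Nokogiri.parse('<root><element>this is <hi>the content</hi> of my element</element></root>').css('element').inner_html
CGI.unescapeHTML(x)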
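
For what it's worth, here is a small sketch of how the two CGI methods differ, using only the standard library. The `escaped` string is a made-up sample of what entity-escaped output might look like, not actual output from any particular libxml2 version.

```ruby
require 'cgi'

# Hypothetical sample: entity-escaped text of the kind described in the question.
escaped = 'this is &lt;hi&gt;the content&lt;/hi&gt; of my &#230;&#248;&#229; element'

# CGI.unescapeHTML turns HTML/XML entities back into literal characters.
html_decoded = CGI.unescapeHTML(escaped)

# CGI.unescape only URL-decodes (%XX sequences and '+'), so entities stay as they are.
url_decoded = CGI.unescape(escaped)

puts html_decoded
puts url_decoded
```

So if the markup comes back as entities, CGI.unescapeHTML is the one that restores it; CGI.unescape should leave a string like this unchanged.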
shingara
Nokogiri.parse('<root><element>this is <hi>the content</hi> of my æøå element</element></root>').css('element').inner_html.inspect => "\"this is <hi>the content</hi> of my æøå element\""
Styggentorsken
You can CGI.unescape the result
shingara
Thank you! CGI.unescapeHTML() worked
Styggentorsken
A: 

I think the previous answer assumes HTML. I'm not sure that's appropriate, so here's my (similar) answer:

require 'nokogiri'
xml = '<root><element>this is <hi>the content</hi> of my æøå element</element></root>' 
p Nokogiri(xml).at('element').to_xml
Levi