My basic aim is to extract parts of an XML file and leave a note in that file saying that something was extracted (like "here something was extracted"). I have been trying a lot with Nokogiri, and it does not seem to be well documented how to:

1) delete all children of a <Nokogiri::XML::Element>
2) change the inner_text of that whole element

Any clues?

A: 

You can try it like this:

require 'nokogiri'

doc = Nokogiri::XML(your_document)
notes = doc.search("note") # find all elements with the node_name "note"
notes.remove               # unlink the matched elements from the document

Be aware that this removes the matched note elements themselves, together with all of their children, not just the children. I am not sure how to "change the inner_text" of, let's say, all note elements. I think inner_text is not applicable for a Nokogiri::XML::Element.

regards, hugo

+1  A: 

Nokogiri makes this pretty easy. Using this document as an example, the following code will find all vitamins tags, remove their children (and the children's children, etc.), and change their inner text to say "Children removed.":

require 'nokogiri'

io = File.open('sample.xml', 'r')
doc = Nokogiri::XML(io)
io.close

doc.search('//vitamins').each do |node|
  node.children.remove
  node.content = 'Children removed.'
end

A given food node will go from looking like this:

<food>
    <name>Avocado Dip</name>
    <mfr>Sunnydale</mfr>
    <serving units="g">29</serving>
    <calories total="110" fat="100"/>
    <total-fat>11</total-fat>
    <saturated-fat>3</saturated-fat>
    <cholesterol>5</cholesterol>
    <sodium>210</sodium>
    <carb>2</carb>
    <fiber>0</fiber>
    <protein>1</protein>
    <vitamins>
     <a>0</a>
     <c>0</c>
    </vitamins>
    <minerals>
     <ca>0</ca>
     <fe>0</fe>
    </minerals>
</food>

to this:

<food>
    <name>Avocado Dip</name>
    <mfr>Sunnydale</mfr>
    <serving units="g">29</serving>
    <calories total="110" fat="100"/>
    <total-fat>11</total-fat>
    <saturated-fat>3</saturated-fat>
    <cholesterol>5</cholesterol>
    <sodium>210</sodium>
    <carb>2</carb>
    <fiber>0</fiber>
    <protein>1</protein>
    <vitamins>Children removed.</vitamins>
    <minerals>
     <ca>0</ca>
     <fe>0</fe>
    </minerals>
</food>
Pesto
A: 

The previous Nokogiri example pointed me in the right direction, but using doc.search left me with a malformed //vitamins, so I used css instead.

require "rubygems"
require "nokogiri"

f = File.open("food.xml")
doc = Nokogiri::XML(f)

doc.css("food vitamins").each do |node|
  puts "\r\n[debug] Before: vitamins= \r\n#{node}"
  node.children.remove
  node.content = "Children removed"
  puts "\r\n[debug] After: vitamins=\r\n#{node}"
end
f.close

Results in:

[debug] Before: vitamins= 
<vitamins>
        <a>0</a>
        <c>0</c>
    </vitamins>

[debug] After: vitamins=
<vitamins>Children removed</vitamins>
eking