
I need to strip out all font tags from a document. When I do so with the following Ruby code, the other elements and text inside the font tags are lost. I have also tried iterating through all child elements and making them siblings of the font tag before unlinking it, which also loses HTML. What is a good method for removing tags that can contain other elements and/or text?

  doc.css('font').each do |element|
    element.unlink
  end

UPDATE (in response to first solution):

The problem with using node.children to obtain the children and then moving them to the font node's parent is that none of the child nodes seem to include the text found within the font node. As soon as the font tag is removed (unlinked), all text within the font tag also disappears from the document.

My revised question is thus: how do I use Nokogiri to obtain the text of the font node, and how can this text be moved to replace the font tag in the font node's position?

+3  A: 

The problem is you're lopping off the node, which also trims the child nodes. You need to preserve the children then append them to the parent node. Once you've done that you can delete the target node.

Take a look at "Replace Node w/ Children" - http://rubyforge.org/pipermail/nokogiri-talk/2009-June/000333.html

In that message Aaron is talking about replacing XML nodes, but it's all the same once an HTML document has been parsed by Nokogiri. You'll need to make some minor tweaks, but it should get you going.

Greg
Thank you. This is pretty close to what is needed. With HTML content, the ordering of nodes is important. Appending nodes to the parent node will likely end up with the nodes not being in the original order.
sutch
Below is how I solved this issue. Basically, this iterates through all children of the node in reverse order and inserts each one after the node as a sibling.

  body.css('font').each do |element|
    element.children.reverse.each do |child|
      child_clone = child.clone
      element.add_next_sibling(child_clone)
      child.unlink
    end
    element.unlink
  end

Thanks again for helping me solve this.
sutch