Hi,

I have an HTML document that needs to be transformed by replacing some tags with other tags.

I don't know these tags in advance, because they come from the db. So the "set_attribute" or "name" methods of Nokogiri are not suitable for me.

I need to do it in a way like this pseudo-code:

def preprocess_content
  doc = Nokogiri::HTML( self.content )
  doc.css("div.to-replace").each do |div|
    # "get_html_text" will obtain HTML from the db. It can be anything, even other tags, tag groups, etc.
    div.replace self.get_html_text
  end
  self.content = doc.css("body").first.inner_html
end

I found the "Nokogiri::XML::Node::replace" method. I think it is the right direction.

This method expects a "node_or_tags" parameter.

Which method should I use to create a new Node from text and replace the current one with it?

A: 

Like this:

doc.css("div.to-replace").each do |div|
    new_node = doc.create_element "span"
    new_node.inner_html = self.get_html_text
    div.replace new_node
end
floatless
It doesn't work for me. I get an error: "no contextual parsing on unlinked nodes". It is raised on the line where the "inner_html" property is set.
AntonAL
I've just tested it, and it works in my environment. Please try replacing `new_node.inner_html =` with `new_node.content =` and double-check for errors. It should work.
floatless
Thanks, I've figured it out. The problem was modifying an element that was not yet linked into the DOM. We must first replace, and then modify. That's nice, but I've run into a more confusing problem: not every piece of markup can be inserted as a replacement. For example, when I say new_node.inner_html = "<div>Test</div>", it works, but when I say new_node.inner_html = "<video src='my_video.mov'></video>", it crashes with the message "undefined method `children' for nil:NilClass" ...
AntonAL
This happens because of HTML strictness. Replace `Nokogiri::HTML( self.content )` with `Nokogiri::XML( self.content )` and do not forget to add a DOCTYPE declaration manually later.
floatless
Thanks, it seems to work, but how can I convert everything back to a string in order to save it? I do the following: "self.content = doc.css("body").first.inner_html", but it complains about the nil returned by "first". However, with Nokogiri::HTML it does work.
AntonAL
It also works for me... What do `doc.to_s` and `p doc.css("body")` output?
floatless
I've figured it out: XML doesn't have a body :) But there is another issue: "doc.root.to_s" returns only the parts that I've replaced, and the other parts that were around them are gone... This is strange.
AntonAL