nokogiri

XPath: how to express text nodes?

Consider: text 1 text 2 text 3. How can you express the text node in XPath? ...

XPath: how do you select the second text node (a specific text node)?

Consider an HTML page <html> apple orange drugs </html>. How can you select orange using XPath? /html/text()[2] doesn't work. ...

Why can't I wrap <span> around the following Nokogiri XPath?

doc = Nokogiri::HTML(open(url)).xpath("//*") .xpath("//*[br]/text()[string-length(normalize-space()) != 0]") .wrap("<span></span>") puts doc It just returns the text ... I was expecting the full HTML source, with <span> now wrapped around the nodes matched by the XPath. ...

nokogiri: how to wrap html tags around given xpath elements ?

I have an xpath to grab each text node which is not surrounded by any html tags. Instead, they are separated via <br>. I would like to wrap these with <span> tags. Nokogiri::HTML(open("http://vancouver.en.craigslist.ca/van/swp/1426164969.html")) .xpath("//br/following-sibling::text()|//br/preceding-sibling::text()").to_a will return t...

wrapping elements with nokogiri ?

Given an XPath, can you do something like doc.xpath("/html/body/a").wrap("<span></span>") and wrap all the links with span tags? ...

How can you manipulate an HTML page parsed via Nokogiri?

So I parsed an HTML page using Nokogiri. I want to wrap tags around each occurrence of links. .wrap() doesn't appear to work properly. puts doc.xpath("//a").wrap("<b></b>"); returns just plain regular unchanged HTML. ...

Disabled/Custom params_parser per action

Hi, I have a create action that handles XML requests. Rather than using the built-in params hash, I use Nokogiri to validate the XML against an XML schema. If this validation passes, the raw XML is stored for later processing. As far as I understand, the XML is parsed twice: first Rails creates the params hash, then the Nokogiri pa...

How to detect mailto links with Hpricot/Nokogiri

I want to match links like <a href="mailto:[email protected]">foo</a>, but this only works in Nokogiri: doc/'a[href ^="mailto:"]' What's the right way of doing that? How do I do that with Hpricot? ...

nokogiri: processing just an HTML fragment and returning it

Hello, When I do the following thing with nokogiri: some_html = '<img src="bleh.jpg"/>test<br/>' f = Nokogiri::HTML(some_html) #do some processing puts f It will print the whole XHTML doc structure with the code above in it. How can I just print/return/get the HTML part which is in the some_html variable? Thanks for help! ...

Removing a node with nokogiri

Hello. How can I remove img tag/s using nokogiri? I have the following code but it won't work: f = Nokogiri::XML.fragment(str) f.search('//img').each do |node| node.remove end puts f Thanks for help! ...

Help with parsing XML document using 'Reader' and Nokogiri

Hi. I am a newbie when it comes to using the Nokogiri Reader to parse an XML file. Here is the XML file I want to parse and sample code: <?xml version='1.0' encoding='UTF-8'?> <inventory> <tire name="super slick racing tire" /> <tire name="all weather tire" /> </inventory> -------------------------------------------------------------...

Scanning each HTML node with nokogiri

Hi, How can we scan each element and sub-element of an HTML document with Nokogiri, and test for each one whether the current tag is a block? According to http://wiki.github.com/tenderlove/nokogiri/examples, we can test if an element is a block using: element[:class] == "block" But I don't see how to scan and test each HTML element... T...

Why doesn't Nokogiri xpath like xmlns declarations?

I'm using Nokogiri::XML to parse responses from Amazon SimpleDB. The response is something like: <SelectResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/"> <SelectResult> <Item> <Attribute><Name>Foo</Name><Value>42</Value></Attribute> <Attribute><Name>Bar</Name><Value>XYZ</Value></Attribute> </Item> </S...

Ruby Nokogiri Parsing HTML table

I am using mechanize/nokogiri and need to parse out the following HTML string. can anyone help me with the xpath syntax to do this or any other methods that would work? <table> <tr class="darkRow"> <td> <span> <a href="?x=mSOWNEBYee31H0eV-V6JA0ZejXANJXLsttVxillWOFoykMg5U65P4x7FtTbsosKRbbBPuYvV8nPhET7b5sFeON4aWpbD10Dq"> <span...

nokogiri multiple css classes

How is it possible to select an HTML element that has two classes? For example, how to select the element p below in an HTML document (given that it has two CSS classes) class='class1 class2' : I tried to use the following : doc.xpath("//p[@class~='class1 class2']") doc.xpath("//p[@class~='class1']|[@class~='class2']") doc.xpath("...

How to get the unescaped inner_html of a Ruby Nokogiri NodeSet?

I would like to get unescaped inner html from a Nokogiri NodeSet. Does anyone know how to do this? ...

Fastest/One-liner way to get XML nodes into array of "path/to/nodes" in Ruby?

What is the fastest, one-liner/shortest way to get an Array of "strings/that/are/paths" from an XML file, using Nokogiri preferably. I'd like to build the array with an arbitrary attribute name ('id' in this case), but also knowing how to do it for the element name would be helpful. So this: <root id="top"> <nodeA id="almost_top"...

Validating XML using multiple XSDs in Ruby

Hi I'm generating a lot of XMPP stanzas, and want to validate them against the specs available here in my unit tests. At the moment I am using Nokogiri to achieve this with something like xml = Nokogiri::XML( xmpp_stanza) schema = Nokogiri::XML::Schema( xmpp_schema ) assert schema.valid?( xml ) Now this works fine except it gets...

Convert XML collection (of Pivotal Tracker stories) to Ruby hash/object

I have a collection of stories in an XML format. I would like to parse the file and return each story as either hash or Ruby object, so that I can further manipulate the data within a Ruby script. Does Nokogiri support this, or is there a better tool/library to use? The XML document has the following structure, returned via Pivotal Tra...

Nokogiri::XML::Reader doesn't seem to recognize 'content' or 'text' methods.

Hi there, I've got a really simple xml doc (extracted from an html table), and a really simple Nokogiri script, but for some reason I can't get the text out of the xml nodes. I can get attributes, but not the text/content. Anyone have any idea what could be wrong with the following? Here's the xml: <?xml version="1.0" encoding="UTF-8"?...