I have the following HTML:
<html>
<body>
<h1>Foo</h1>
<p>The quick brown fox.</p>
<h1>Bar</h1>
<p>Jumps over the lazy dog.</p>
</body>
</html>
...and by using the RubyGem Nokogiri (an Hpricot replacement), I'd like to change it into the following HTML:
<html>
<body>
<p class="title">Foo</p>
<p>The quick brown fox.</p>
<p class="title"...
I'd like to find a way to get the HTML result (shown further below) using the following Ruby code and the Nokogiri RubyGem:
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
"<html>
<body>
<p id='1'>A</p>
<p id='2'>B</p>
<h1>Bla</h1>
<p id='3'>C</p>
...
I use Nokogiri's (RubyGem) CSS search to look for certain <div> elements in my HTML. Nokogiri's CSS search doesn't seem to support regular expressions, so I would like to switch to Nokogiri's XPath search, which seems to support regex in search strings.
How do I implement the (pseudo) CSS search shown below as an XPath search?
require 'rubygems'
requi...
I'm trying to fill the variables parent_element_h1 and parent_element_h2. Can anyone help me use the Nokogiri Gem to get the information I need into those variables?
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
"<html>
<body>
<p id='para-1'>A</p>
<div class='block' id='X1'>
...
I am attempting to get a gem I've just installed working in a Rails application. I can require the gem just fine in a Ruby program that I run from the command line using:
require 'nokogiri'
But when I attempt to do the same in one of my Rails controllers, it fails with the error "no such file to load -- nokogiri".
I tried using the full pat...
What's the smartest way to have Nokogiri select all content between the start and the stop element (including start-/stop-element)?
Check example code below to understand what I'm looking for:
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
"<html>
<body>
<p id='para-1'>A</p>
<div clas...
I have an unsorted Array holding the following IDs:
@un_array = ['bar', 'para-3', 'para-2', 'para-7']
Is there a smart way of using Nokogiri (or plain Javascript) to sort the array according to the order of the IDs in the example HTML document below?
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
...
I have the following XML document:
<samlp:LogoutRequest ID="123456789" Version="2.0" IssueInstant="200904051217">
<saml:NameID>@NOT_USED@</saml:NameID>
<samlp:SessionIndex>abcdefg</samlp:SessionIndex>
</samlp:LogoutRequest>
I'd like to get the content of the SessionIndex (that is, 'abcdefg') out of it. I've tried this:
XPATH_QUE...
Hi,
I want to extract all URLs from a web page. How can I do that with Nokogiri?
Example:
<div class="heat">
<a href='http://example.org/site/1/'>site 1</a>
<a href='http://example.org/site/2/'>site 2</a>
<a href='http://example.org/site/3/'>site 3</a>
</div>
The result should be a list:
l = ['http://example.org/site/1...
Hi, I have a question about Nokogiri. I need to get the HTML elements from a page and get the XPath for each one. The problem is that I can't figure out how to do it with Nokogiri. The HTML is arbitrary, because I have to parse several pages from different websites.
...
Hello all,
I have a node which has two children: an XML text and an XML element.
<h1 id="Installation-blahblah">Installation on server<a href="#Installation-blah" class="wiki-anchor">¶</a>
In this case the XML text is:
Installation on server
and the XML element:
<a href="#Installation-blah" class="wiki-anchor">anchor;</...
Hi, I'm trying to add a bunch of html to an existing nodeset, at the top. It mostly works, but the style tags and script tags are getting scrubbed of their content. Here's what I mean:
doc.xpath("//head/*[1]").before("<script>var xb=25</script>")
But if I try to display this, this is what I get:
hdoc.xpath("//head/*[1]")
=> <script><...
I want to use Nokogiri to loop through an HTML document and create an object corresponding to every row. I am able to define the root XPaths that the data for the object's variables should come from, but I don't know how to group these into an object.
My code is below. I know it doesn't work, but I don't know what direction to go in to make it work.
...
Hello all, I'm just beginning with Nokogiri and have a question. I hope you can help me out:
1) I need to parse a set of XML files (let's say 5 files).
2) Find elements with a specific value (for instance, City = "London") using XPath.
3) Create a new XML file containing the results of the previous XPath queries.
...
I decided to give Nokogiri a try, and copied the following program straight from http://nokogiri.rubyforge.org/nokogiri/Nokogiri.html (adding only the require 'rubygems' and the I_KNOW_I_AM_USING_AN_OLD_AND_BUGGY_VERSION_OF_LIBXML2 constant):
require 'rubygems'
I_KNOW_I_AM_USING_AN_OLD_AND_BUGGY_VERSION_OF_LIBXML2 = 1
require 'nokogiri'...
A sample of some oddness:
#!/usr/bin/ruby
require 'rubygems'
require 'open-uri'
require 'nokogiri'
print "without read: ", Nokogiri(open('http://weblog.rubyonrails.org/')).class, "\n"
print "with read: ", Nokogiri(open('http://weblog.rubyonrails.org/').read).class, "\n"
Running this returns:
without read: Nokogiri::XML::Document...
How could I use ruby to extract information from a table consisting of these rows? Is it possible to detect the comments using nokogiri?
EXTRACT LINK 1
EXTRACT DESCRIPTION
EXTRACT LINK 2
Mr P
1
...
I have a document containing <a href> links I want to extract. The links I want can be identified by part of the URL they link to. There are other, similar links which I want to discard.
The URLs of the links I want have the format
http://www.xxxxxxxxxxxxxxxxxxx.com/index.php?showtopic=44&hl=
I want to search for links con...
Is there an easy way to convert a Nokogiri XML document to a Hash?
Something like Rails' Hash.from_xml.
...
I have an HTML element like:
<div id="spam[500]">
I want to search for this element by id, but it seems that Nokogiri is getting confused by the []. I'm trying:
doc.css("#spam[#{eggs.id}]")
but to no avail.
...