I have the following HTML:
<html>
<body>
<h1>Foo</h1>
<p>The quick brown fox.</p>
<h1>Bar</h1>
<p>Jumps over the lazy dog.</p>
</body>
</html>
...and by using the RubyGem Nokogiri (an Hpricot replacement), I'd like to change it into the following HTML:
<html>
<body>
<p class="title">Foo</p>
<p>The quick brown fox.</p>
<p class="title"...
            
I'd like to figure out how to get to the HTML result (shown further below) using the following Ruby code and the Nokogiri gem:
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
  "<html>
    <body>
      <p id='1'>A</p>
      <p id='2'>B</p>
      <h1>Bla</h1>
      <p id='3'>C</p>
    ...
            
I use Nokogiri's CSS search to look for certain <div> elements in my HTML. Nokogiri's CSS search doesn't appear to support regular expressions, so I would like to switch to its XPath search, which seems to support them in search strings.
How do I implement the (pseudo) CSS search shown below as an XPath search?
require 'rubygems'
requi...
            
I'm trying to fill the variables parent_element_h1 and parent_element_h2. Can anyone help me use the Nokogiri gem to get the information I need into those variables?
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
  "<html>
    <body>
      <p id='para-1'>A</p>
      <div class='block' id='X1'>
        ...
            
I am attempting to get a gem I've just installed working in a Rails application. I can require the gem just fine in a Ruby program that I run from the command line using:
require 'nokogiri'
But when I attempt to do the same in one of my Rails controllers, it fails with "no such file to load -- nokogiri".
I tried using the full pat...
            
What's the smartest way to have Nokogiri select all content between a start element and a stop element (including the start and stop elements themselves)?
Check the example code below to see what I'm looking for:
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
  "<html>
    <body>
      <p id='para-1'>A</p>
      <div clas...
            
I have an unsorted array holding the following IDs:
@un_array = ['bar', 'para-3', 'para-2', 'para-7']
Is there a smart way of using Nokogiri (or plain JavaScript) to sort the array according to the order of the IDs in the example HTML document below?
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
  ...
            
I have the following XML document:
<samlp:LogoutRequest ID="123456789" Version="2.0" IssueInstant="200904051217">
  <saml:NameID>@NOT_USED@</saml:NameID>
  <samlp:SessionIndex>abcdefg</samlp:SessionIndex>
</samlp:LogoutRequest>
I'd like to get the content of the SessionIndex element (that is, 'abcdefg') out of it. I've tried this:
XPATH_QUE...
            
Hi,
I want to extract all URLs from a web page. How can I do that with Nokogiri?
example:
<div class="heat">
   <a href='http://example.org/site/1/'>site 1</a>
   <a href='http://example.org/site/2/'>site 2</a>
   <a href='http://example.org/site/3/'>site 3</a>
</div>
The result should be a list:
l = ['http://example.org/site/1...
            
Hi, I have a question about Nokogiri. I need to get the HTML elements from a page and get the XPath for each one. The problem is that I can't figure out how to do it with Nokogiri. The HTML is arbitrary, because I have to parse several pages from different websites.
...
            
Hello all,
I have a node which has two children: an XML text and an XML element.
<h1 id="Installation-blahblah">Installation on server<a href="#Installation-blah" class="wiki-anchor">¶</a>
In this case the XML text is:
Installation on server
and the XML element:
   <a href="#Installation-blah" class="wiki-anchor">anchor;</...
            
Hi, I'm trying to add a bunch of HTML at the top of an existing nodeset. It mostly works, but the style tags and script tags are getting scrubbed of their content. Here's what I mean:
doc.xpath("//head/*[1]").before("<script>var xb=25</script>")
But if I try to display this, this is what I get:
hdoc.xpath("//head/*[1]")
=> <script><...
            
I want to use Nokogiri to loop through an HTML document and create an object corresponding to every row. I am able to define the root XPaths that the data for the object's variables should come from, but I don't know how to group these into an object.
My code is below. I know it doesn't work, but I don't know which direction to take to make it work.
...
            
Hello all, I'm just beginning with Nokogiri and have a question; I hope you can help me out:
1) I need to parse a set of XML files (let's say 5 files).
2) Find elements with a specific value (for instance, City = "London") using XPath.
3) Write a new XML file with the results of the previous XPath query.
...
            
I decided to give Nokogiri a try, and copied the following program straight from http://nokogiri.rubyforge.org/nokogiri/Nokogiri.html (adding only the require 'rubygems' and the I_KNOW_I_AM_USING_AN_OLD_AND_BUGGY_VERSION_OF_LIBXML2 constant):
require 'rubygems'
I_KNOW_I_AM_USING_AN_OLD_AND_BUGGY_VERSION_OF_LIBXML2 = 1
require 'nokogiri'...
            
A sample of some oddness:
#!/usr/bin/ruby
require 'rubygems'
require 'open-uri'
require 'nokogiri'
print "without read: ", Nokogiri(open('http://weblog.rubyonrails.org/')).class, "\n"
print "with read:    ", Nokogiri(open('http://weblog.rubyonrails.org/').read).class, "\n"
Running this returns:
without read: Nokogiri::XML::Document...
            
How could I use Ruby to extract information from a table consisting of these rows? Is it possible to detect the comments using Nokogiri?
EXTRACT LINK 1
EXTRACT DESCRIPTION
EXTRACT LINK 2
Mr P
1
...
            
I have a document containing <a href> links I want to extract. The links I want can be identified by part of the URL they point to. There are other, similar links that I want to discard.
The URLs of the links I want have the format
http://www.xxxxxxxxxxxxxxxxxxx.com/index.php?showtopic=44&hl=
I want to search for links con...
            
Is there an easy way to convert a Nokogiri XML document to a Hash?
Something like Rails' Hash.from_xml.
...
            
I have an HTML element like:
<div id="spam[500]">
I want to search for this element by id, but it seems that Nokogiri is getting confused by the []. I'm trying:
doc.css("#spam[#{eggs.id}]")
but to no avail.
...