I am trying to install hpricot using the command:
>gem install hpricot -v 0.8.2
Building native extensions. This could take a while...
ERROR: Error installing hpricot:
ERROR: Failed to build gem native extension.
C:/Ruby19/bin/ruby.exe extconf.rb
checking for stdio.h... * extconf.rb failed *
Could not create Makefile due to some ...
Hi, I've come across an issue that I can't seem to get past. I'm also new to Ruby on Rails, hence the number of questions.
I am attempting to scrape a webpage such as the following:
http://www.yellowpages.com.mt/Malta/Grocers-Mini-Markets-Retail-In-Malta-Gozo.aspx
I would like to scrape The Addres...
hi
I am trying to use hpricot in JRuby.
My problem is the following. If I have this code:
#!ruby
require 'hpricot'
require 'open-uri'
# load the RedHanded home page
doc = Hpricot(open("http://redhanded.hobix.com/index.html"))
where do I put it?
Into my controller? Because it's not accepting it there.
And if I'm supposed to put it...
Hi
I am trying to use hpricot in a controller. I would like to pass this value to an html.erb page so I can display it on the screen.
So I wrote this:
session[:allcars] = (doc/"td.car_title/text()")
but this gives an error.
When I tried this:
puts (doc/"td.car_title/text()")
this printed the cars into the console.
So I can't under...
Using Hpricot, it is pretty easy to see how I can extract all elements with a given id or class using a CSS selector. Is it possible to extract elements from a document based on whether some attribute of those elements matches a regular expression?
...
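One way to approach attribute matching by regular expression is to select a broad set of elements and then filter them in plain Ruby. The sketch below is only an illustration: it swaps in REXML from the standard library instead of Hpricot so that it runs without extra gems, and the sample markup and the helper name `elements_matching` are made up for the example.

```ruby
require 'rexml/document'

# Hypothetical sample markup, not taken from the question.
xml = <<~XML
  <root>
    <span class="foo_123_bar">first</span>
    <span class="other">second</span>
    <div id="item-42">third</div>
  </root>
XML

doc = REXML::Document.new(xml)

# Hypothetical helper: return every element whose attribute
# `attr_name` has a value matching `pattern`.
def elements_matching(doc, attr_name, pattern)
  REXML::XPath.match(doc, '//*').select do |el|
    value = el.attributes[attr_name] # nil when the attribute is absent
    value && pattern.match?(value)
  end
end

matches = elements_matching(doc, 'class', /\Afoo_\d+_bar\z/)
matches.each { |el| puts el.text }
```

The same select-then-filter idea should carry over to other parsers, as long as they expose attribute values as strings.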
My wife enjoys it when I use my geek abilities to be "romantic" so I had an idea for a ruby script to install on her Mac that would send her quotes and little notes from me throughout the day.
I already figured out that I'll be using GeekTool to run a script in the background and I'll use growlnotify to display the messages.
Now what I n...
I am going to be using Hpricot to process an XML file. I want to randomly display some quotes from the file, and then I want to keep track of how often each quote has been displayed.
Is it possible for me to update a single item within the XML file using Hpricot (or is there some other solution that can do this for me?) or should I just...
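A rough sketch of the "pick a quote and bump its counter" idea follows. It uses REXML from the standard library rather than Hpricot, and the XML layout (a `shown` attribute on each `quote` element) is an assumption made purely for illustration.

```ruby
require 'rexml/document'

# Hypothetical file layout: each quote carries a display counter.
xml = <<~XML
  <quotes>
    <quote shown="0">First quote</quote>
    <quote shown="2">Second quote</quote>
  </quotes>
XML

doc = REXML::Document.new(xml)
quotes = REXML::XPath.match(doc, '//quote')

# Pick one quote at random and increment its counter in place.
picked = quotes.sample
picked.attributes['shown'] = (picked.attributes['shown'].to_i + 1).to_s

puts picked.text

# Serialize the whole document again. In a real script this string
# would be written back to the file, e.g. File.write(path, updated),
# where `path` is whatever file the quotes were loaded from.
updated = doc.to_s
```

This rewrites the whole file on every update, which is usually fine for a small quotes file.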
Is there a way to load a chunk of HTML into an Hpricot::Doc object?
I am trying to parse various chunks of HTML within custom tags from a page.
So if I have:
<foo>
<b>here is some stuff</b>
<table>
<tr>
<td>one</td>
<td>two</td>
</tr>
<tr>
<td>three</td>
<td>four</td>
</tr>
</table>
</foo...
The following hpricot code successfully extracts the STPeriods in the XML on two of my machines (Vista and an Ubuntu server) but fails on another Ubuntu laptop. All machines have Hpricot v0.8.2.
Any ideas? Totally stumped.
Hpricot code:
(doc/"WeatherFeed/Location/WxShortTerm/STPeriod").each do |ham_forecast|
XML file
<?xml version=...
I'm trying to use Hpricot to get the value within a span with a class name I don't know. I know that it follows the pattern "foo_[several digits]_bar".
Right now, I'm getting the entire containing element as a string and using a regex to parse the string for the tag. That solution works, but it seems really ugly.
doc = Hpricot(open("ht...
I want to remove a list of DOM event attributes from HTML. How can I do this? For example:
before = '<div onclick="abc" >abc</div>'
after = clean_it(before) # after => "<div>abc</div>"
DOM_EVENT_TO_BE_REMOVE = "onclick|ondblclick|onerror|onfocus|onkeydown" # I want to remove these events
# I want to do it like this
def clean_it(html...
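A minimal sketch of one possible `clean_it`, using only regular expressions and no parser. This is an assumption about what the helper could look like, not the asker's actual code, and it inherits the usual fragility of regex-based HTML handling (for example, it would misbehave if an attribute value contained a `>` character).

```ruby
# Event attributes to strip; the list mirrors the one in the question.
DOM_EVENTS_TO_REMOVE = %w[onclick ondblclick onerror onfocus onkeydown].freeze

# Matches one event attribute with a double-quoted, single-quoted,
# or unquoted value, including the whitespace in front of it.
EVENT_ATTR = /\s+(?:#{DOM_EVENTS_TO_REMOVE.join('|')})\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/i

def clean_it(html)
  # Only touch text inside tags, so body text that merely mentions
  # "onclick=" is left alone.
  html.gsub(/<[^>]+>/) do |tag|
    tag.gsub(EVENT_ATTR, '').sub(/\s+>\z/, '>')
  end
end

before = '<div onclick="abc" >abc</div>'
after = clean_it(before)
puts after # prints <div>abc</div>
```

For untrusted input, a parser-based or whitelist-based sanitizer is generally the safer route; this sketch only shows the shape of the attribute-stripping step.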
I'm working on a Ruby script to grab historical stock prices from Yahoo, using Hpricot to parse the pages. This is mostly straightforward: the URL is "http://finance.yahoo.com/q/hp?s=TickerSymbol". For example, to look up Google, I would use "http://finance.yahoo.com/q/hp?s=GOOG".
Unfortunately, it breaks down when I'm looking up the price...
I have been experimenting with Watir, Nokogiri and Hpricot. All of these use a top-down approach, which is my problem: they search for an element by its type. I want to find an element by its text, without knowing the element type.
e.g.
<element1>
<element2> Text2 </element2>
<element3> Text3 </element3>
text4
</elem...
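One possible way to search by text rather than by element type is to walk every element and test its own text nodes. The sketch below uses REXML from the standard library (not Watir, Nokogiri or Hpricot) so that it is self-contained; the helper name `elements_with_text` is hypothetical and the markup is modeled on the snippet above.

```ruby
require 'rexml/document'

xml = <<~XML
  <element1>
    <element2> Text2 </element2>
    <element3> Text3 </element3>
    text4
  </element1>
XML

doc = REXML::Document.new(xml)

# Hypothetical helper: return all elements that have a direct
# text child containing `needle`, whatever their tag name is.
def elements_with_text(doc, needle)
  REXML::XPath.match(doc, '//*').select do |el|
    el.texts.any? { |t| t.value.include?(needle) }
  end
end

elements_with_text(doc, 'Text3').each { |el| puts el.name }
```

Because only direct text children are checked, a match on `text4` would report the enclosing `element1` and not its siblings.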
If I have an empty tag:
<tag/>
How can I add text so that I end up with:
<tag>Hello World!</tag>
I can only seem to swap the whole tag with different content or add content before/after it.
...
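For comparison, here is how filling an empty tag looks with REXML from the standard library. This is a sketch with a different parser than the one in the question, and the wrapping `root` element is an assumption added so the fragment is a well-formed document.

```ruby
require 'rexml/document'

# The empty tag from the question, wrapped in a hypothetical root.
doc = REXML::Document.new('<root><tag/></root>')

tag = doc.root.elements['tag']
tag.text = 'Hello World!' # sets the element's text content in place

result = tag.to_s
puts result
```

The point of the sketch is that the element itself is kept and only its content changes, as opposed to swapping the whole tag.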
I use the hpricot gem in Ruby on Rails to parse a webpage and extract the meta-tag contents. But if the website has a <noscript> tag just after the <head> tag, it throws an exception:
Exception: undefined method `[]' for nil:NilClass
I even tried updating the gem to the latest version, but the result is still the same.
This is the sample code I use.
r...
Hpricot(html).inner_text.gsub("\r"," ").gsub("\n"," ").split(" ").join(" ")
hpricot = Hpricot(html)
hpricot.search("script").remove
hpricot.search("link").remove
hpricot.search("meta").remove
hpricot.search("style").remove
Found it on http://www.savedmyday.com/2008/04/25/how-to-extract-text-from-html-using-rubyhpricot/
...
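The whitespace clean-up at the end of that one-liner can be looked at on its own. As a small sketch in plain Ruby (the sample string is made up), `String#split` with no arguments already splits on any run of whitespace, including `\r` and `\n`, so the two `gsub` calls are not strictly needed:

```ruby
# Hypothetical extracted text with mixed line endings and extra spaces.
text = "Hello,\r\n   world!\n\nBye"

# Split on any whitespace run, then rejoin with single spaces.
normalized = text.split.join(' ')

puts normalized # prints Hello, world! Bye
```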
Hey,
I am trying to write a CSS selector that selects everything except the script elements with hpricot. I can easily select all the contents of the select-me div and then remove the script elements, but I was wondering if it's possible to use a selector that excludes the script elements:
<div class='select-me'>
<p>This is some ...
I've just noticed that a lot of hpricot code is written in Java...
I heard that JRuby performs a lot better than native Ruby when processing regular expressions. Are the Java classes perhaps only activated if JRuby or Java is installed, with the Ruby code used if they are not found?
It's something puzzling indeed.
Thanks
...
I am using Hpricot to parse a theme file. I have noticed, however, that if I feed a valid HTML5 document into Hpricot(), it auto-closes HTML5 tags (like <section>), and messes with the DOCTYPE.
Are there any extensions to Hpricot, or perhaps a flag I need to set, that will allow HTML5 documents to be parsed correctly?
...
Which one would you choose? My important attributes are (not in order):
Support & Future enhancements
Community & general knowledge base (on the Internet)
Comprehensive (i.e. proven to parse a wide range of *.*ml pages)
Performance
Memory Footprint (runtime, not the code-base)
...