I'm trying to use Scrubyt to get the details from this page http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php?section=events. I've managed to get the titles and detail URLs from the list, but I can't use next_page to get the scraper to go to the next page. I assume that's because I'm not using the correct pattern for the next p...
I'm trying to use Scrubyt to navigate around a website, but whenever I use it to click any links it gives me 403 Forbidden errors. The website doesn't require a login or anything, so I don't understand this. Might it need some kind of session variable, or the right User-Agent string? Any idea how I might fix this?
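One way to test the User-Agent theory outside Scrubyt is to send the same request with Ruby's standard Net::HTTP and an explicit header. This is only a sketch: the helper name `browser_like_request`, the URL, and the agent string are placeholders rather than anything from the question, and whether the server really filters on User-Agent is an assumption.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds a GET request carrying an explicit User-Agent header.
def browser_like_request(url, user_agent)
  request = Net::HTTP::Get.new(URI.parse(url))
  request['User-Agent'] = user_agent
  request
end

# Placeholder values, not taken from the question.
example_url   = 'http://example.com/events'
example_agent = 'Mozilla/5.0 (compatible; ExampleScraper/1.0)'

request = browser_like_request(example_url, example_agent)

# Sending is left commented out so the sketch runs offline:
# response = Net::HTTP.start(request.uri.host, request.uri.port) { |http| http.request(request) }
# puts response.code
```

If the status code changes from 403 once the header is set, the User-Agent is the likely culprit; if not, cookies or the Referer header are the next things to check.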
...
This might be a similar problem to my earlier two questions (see here and here), but I'm trying to use the _detail command to automatically click the link so I can scrape the details page for each individual event.
The code I'm using is:
require 'rubygems'
require 'scrubyt'
nuffield_data = Scrubyt::Extractor.define do
fetch 'http://w...
I'm trying to transition this bit of code from scrubyt to nokogiri, and am stuck trying to write my results to either a hash or xml. In scrubyt it looks like the following:
require 'rubygems'
require 'scrubyt'
result_data = Scrubyt::Extractor.define do
fetch "http://rads.stackoverflow.com/amzn/click/0061673730"
results "//d...
I'm having problems deciding between hpricot and scrubyt and I was wondering if someone who has worked with them could provide an advantages/disadvantages list for each.
...
Hi, I've come across an issue that I can't seem to get past. I'm also new to Ruby on Rails, hence the number of questions.
I am attempting to scrape a webpage such as the following:
http://www.yellowpages.com.mt/Malta/Grocers-Mini-Markets-Retail-In-Malta-Gozo.aspx
I would like to scrape The Addres...
How do I target one form from another when there are 2 forms in the same page like this page?
http://screener.finance.yahoo.com/stocks.html
Here's my sample code:
require 'rubygems'
require 'scrubyt'
extractor = Scrubyt::Extractor.define do
fetch 'http://screener.finance.yahoo.com/stocks.html'
select_option('prmin', '5')
select...
Does anyone know of a way to get fill_textfield to accept a big5-encoded string in the query_field? I keep getting an "unterminated string meets end of file" error with this:
require 'rubygems'
require 'scrubyt'
search_data = Scrubyt::Extractor.define do
  fetch 'http://www.google.com/ncr'
  fill_textfield 'q', '你好世界'
  submit
end
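A possible workaround, sketched here for a modern Ruby (1.9 or later), is to keep the source file in UTF-8 and transcode the query to Big5 at runtime instead of saving the script itself as Big5. The variable names are illustrative, and whether fill_textfield passes the resulting bytes through unchanged is an assumption that has not been verified.

```ruby
require 'uri'

# The source file is assumed to be saved as UTF-8 (the default since Ruby 2.0).
query_utf8 = '你好世界'

# Transcode to Big5 at runtime rather than storing Big5 bytes in the source.
query_big5 = query_utf8.encode('Big5')

# Percent-encode the Big5 bytes in case the query is sent as part of a URL.
escaped = URI.encode_www_form_component(query_big5)

puts query_big5.encoding
puts escaped
```

The transcoded string could then be tried as the second argument to fill_textfield in place of the literal.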
...
I've written a scrubyt extractor based on the 'learning' technique - that is, specifying the current text on the page and getting it to work out the XPath expressions itself. However, I now want to export the extractor so that it can be used even when the page has changed.
The documentation for scrubyt seems to be all over the place now...
I can't seem to get a page to load with scrubyt, and I think it's because the page I am navigating to checks the referer. Is it possible to set the referer on the fetch action?
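Whether scrubyt's fetch accepts a referer option is not shown here, so the following sketch only checks the theory with Ruby's standard Net::HTTP, passing the Referer in the initial header hash. Both URLs and the variable names are placeholders.

```ruby
require 'net/http'
require 'uri'

# Placeholder URLs, not taken from the question.
target_url  = 'http://example.com/protected/page'
referer_url = 'http://example.com/index'

# Net::HTTP::Get.new takes an optional hash of initial headers.
uri = URI.parse(target_url)
request = Net::HTTP::Get.new(uri, 'Referer' => referer_url)

# Sending is left commented out so the sketch runs offline:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.code
```

If the page loads with the header and fails without it, the referer check is confirmed and the remaining question is how to pass the same header through scrubyt.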
...
I'm trying to use scrubyt to scrape a page and have everything working except for a decent way of advancing to the next page of the results. The next_page approach isn't working because the URL is relative.
I figured out a simple way to do it but it all hinges on being able to use something like:
if node_exists("//div[@class='pagina...
I am by no means a master with Ruby and am quite new to Scrubyt. I was just trying out some examples found on their wiki page. The example I was working on was getting the search results returned by Google when you search for 'ruby', and I had the idea of grabbing the URL of each result so I could go ahead and fetch that page as well. The...
Hello all. I'm trying to scrape the Yellow Pages website. Specifically, this link: http://www.yellowpages.com/santa-barbara-ca/restaurants. My code works perfectly except for one small problem. Because the "Next" link to go to the next page of restaurants is a relative link, Scrubyt's "next_page" function doesn't work...apparently...
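If the relative link really is the obstacle, one workaround is to resolve the href against the current page URL yourself and fetch the absolute result. A minimal sketch using the standard library's URI.join follows; the helper name `absolute_next_url` and the `?page=2` href are made up for illustration, since the excerpt does not show the real link.

```ruby
require 'uri'

# Hypothetical helper: resolves a possibly relative "Next" href against the
# URL of the page it was found on.
def absolute_next_url(current_page_url, next_href)
  URI.join(current_page_url, next_href).to_s
end

page = 'http://www.yellowpages.com/santa-barbara-ca/restaurants'

# Illustrative hrefs only; the real "Next" link is not shown in the question.
root_relative = '/santa-barbara-ca/restaurants?page=2'
path_relative = 'restaurants?page=2'

puts absolute_next_url(page, root_relative)
puts absolute_next_url(page, path_relative)
```

Both calls resolve to the same absolute URL, which can then be passed to a plain fetch for the next page.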