sudo gem install ruby-debug
This gives you access to a nice Ruby debugger. Start the debugger by altering your script:
require 'rubygems'
require 'ruby-debug'
Debugger.start
Debugger.settings[:autoeval] = true if Debugger.respond_to?(:settings)
require 'scrubyt'
nuffield_data = Scrubyt::Extractor.define do
  fetch 'http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php'
  event do
    title 'The Coast of Mayo'
    link_url
    event_detail do
      dates "1-4 October"
      times "7:30pm"
    end
  end
  next_page "Next Page", :limit => 2
end
nuffield_data.to_xml.write($stdout,1)
Then find out where scrubyt is throwing an exception - in this case:
/Library/Ruby/Gems/1.8/gems/scrubyt-0.3.4/lib/scrubyt/core/navigation/fetch_action.rb:52:in `fetch'
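If the backtrace is long, it can help to pull out just the innermost frame, since that is the file and line you will want to open. Here is a minimal plain-Ruby sketch of that idea. The `risky_fetch` and `locate_failure` names and the example URL are made up for illustration and are not part of scrubyt; it also assumes the frame has the usual `file:line:in ...` shape.

```ruby
# Hypothetical stand-in for the scrubyt call that blows up.
def risky_fetch(url)
  raise ArgumentError, "bad URL: #{url}"
end

# Runs the block; if it raises, returns a hash describing the innermost frame.
def locate_failure
  yield
  nil
rescue StandardError => e
  # backtrace.first is the frame where the exception was raised,
  # formatted like "path/to/file.rb:52:in `fetch'" (assumed shape).
  match = e.backtrace.first.match(/\A(.+):(\d+):in /)
  { :error => e.class.name, :message => e.message,
    :file => match[1], :line => match[2].to_i }
end

info = locate_failure { risky_fetch('http://example.invalid/') }
puts "#{info[:error]} raised at #{info[:file]}:#{info[:line]} (#{info[:message]})"
```

In your real script you would wrap the `Scrubyt::Extractor.define` block instead of `risky_fetch`.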
Find the scrubyt gem on your system, and add a rescue clause to the method in question so that the end of the method looks like this:
  if @@current_doc_protocol == 'file'
    @@hpricot_doc = Hpricot(PreFilterDocument.br_to_newline(open(@@current_doc_url).read))
  else
    @@hpricot_doc = Hpricot(PreFilterDocument.br_to_newline(@@mechanize_doc.body))
    store_host_name(self.get_current_doc_url) # in case we're on a new host
  end
rescue
  debugger
  self # the self is here because debugger doesn't like being at the end of a method
end
Now run the script again and you should be dropped into the debugger when the exception is raised. Try typing this at the debug prompt to see what the offending URL is:
@@current_doc_url
You can also add a debugger statement anywhere in that method if you want to check what is going on. For example, you may want to add one between lines 51 and 52 of this method to check how the URL being requested changes, and why.
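If an interactive session is not convenient (say the script runs from cron), a rough alternative is to rescue the error and re-raise it with the URL attached, so the offending URL shows up in the error message itself. This is only a sketch in plain Ruby: `FetchError`, `fetch_with_context` and the example URL are hypothetical names of mine, not anything scrubyt provides.

```ruby
# Hypothetical error class that carries the URL in its message.
class FetchError < StandardError; end

# Yields the URL to the block; on failure, re-raises with the URL in the
# message and the original backtrace preserved.
def fetch_with_context(url)
  yield url
rescue StandardError => e
  raise FetchError, "#{e.class}: #{e.message} (while fetching #{url.inspect})", e.backtrace
end

report = nil
begin
  # The block stands in for the real fetch; here it always fails.
  fetch_with_context('http://example.invalid/listings?page=2') do |_url|
    raise ArgumentError, 'no such page'
  end
rescue FetchError => e
  report = e.message
end
puts report
```

The debugger is still the better tool when you need to poke around, but this gets the URL into a log without stopping the script at a prompt.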
This is basically how I figured out the answer to your previous questions.
Good luck.