I'm trying to use Mechanize to perform a simple search on my college's class schedule database. The following code returns an empty array (`[]`); however, the same approach works for logging into Facebook and searching Google (with different URLs and params). What am I doing wrong?

I'm following the latest (great) railscast here. Mechanize documentation has been useful but I'm still puzzled. Thanks in advance for your suggestions!

ruby script/console
require 'mechanize'
agent = WWW::Mechanize.new
agent.get("https://www.owens.edu/cgi-bin/class.pl/")
agent.page.forms
form = agent.page.forms.last
form.occ_subject = "chm"
form.submit.search
=> []
A: 

The page returns a null result when it is queried through WWW::Mechanize.

I'm not sure whether WWW::Mechanize can handle POSTing to this secure page.

"can't convert nil into String" means Ruby was given nil where it needed a String; there is nothing there to convert into text.

It also might be a problem with the form and the script delay.

Try using curl for debugging by POSTing the form field directly, e.g. curl -d "occ_subject=chm" https://www.owens.edu/cgi-bin/class.pl. When I tried that, it returned a page.
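
For reference, the curl call above sends a form-encoded POST body. Here is a minimal Ruby sketch of the same request using only the standard library. It builds the request without sending it, so you can inspect what would go over the wire. The helper name build_class_search_request is hypothetical, and the single occ_subject field is an assumption taken from the field used in the question; the real form may expect more fields.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) the POST that
# `curl -d "occ_subject=<subject>" <url>` would issue.
def build_class_search_request(subject, url = "https://www.owens.edu/cgi-bin/class.pl")
  uri = URI.parse(url)
  request = Net::HTTP::Post.new(uri.request_uri)
  # set_form_data URL-encodes the fields and sets the
  # application/x-www-form-urlencoded content type, as curl -d does.
  request.set_form_data("occ_subject" => subject)
  request
end

request = build_class_search_request("chm")
puts request.method           # HTTP verb
puts request.path             # path portion of the URL
puts request.body             # encoded form body
puts request["Content-Type"]  # content type set by set_form_data
```

To actually send it, something like Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) } should work. Comparing that response body with what Mechanize returns may show whether the site treats the two clients differently.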

I think it's a problem with the secure page and the cgi script combined.

CodeJoust
The correct way to POST a form using CURL is `curl -d "occ_subject=chm" https://www.owens.edu/cgi-bin/class.pl`
Steve Graham
A: 

Remove search from form.submit.search, i.e. use form.submit. I'm guessing you appended search to submit thinking that it has something to do with the value of the submit button, i.e. "search".

What your code is doing IS successfully submitting the form. However, you are then calling the search method of the resulting page object without an argument. The search method expects a selector, e.g. 'body div#nav_bar ul.links li', as an argument so that it can return an array of elements matching that selector. With no selector, no elements can match, hence the empty array.

Edit per your response:

Your code:

ruby script/console
require 'mechanize'
agent = WWW::Mechanize.new
agent.get("https://www.owens.edu/cgi-bin/class.pl/")
agent.page.forms
form = agent.page.forms.last
form.occ_subject = "chm"
form.submit.search
=> []

What I tried and got to work:

ruby script/console

require 'mechanize'
agent = WWW::Mechanize.new
agent.get("https://www.owens.edu/cgi-bin/class.pl")
agent.page.forms
form = agent.page.forms.last
form.occ_subject = "chm"
form.submit # <- No search method.
=> Insanely long array of HTML elements

The same code will not work with Google either:

require 'mechanize'
require 'nokogiri'
agent = WWW::Mechanize.new
agent.get("http://www.google.com")
form = agent.page.forms.last
form.q = "stackoverflow"
a = form.submit.search
b = form.submit
puts a
=> [] # <--- EMPTY!

puts b
#<WWW::Mechanize::Page
 {url
  #<URI::HTTP:0x1020ea878 URL:http://www.google.co.uk/search?hl=en&source=hp&ie=ISO-8859-1&q=stackoverflow&meta=>}
 {meta}
 {title "stackoverflow - Google Search"}
 {iframes}
 {frames}
 {links
  #<WWW::Mechanize::Page::Link
   "Images"
   "http://images.google.co.uk/images?hl=en&source=hp&q=stackoverflow&um=1&ie=UTF-8&sa=N&tab=wi">
  #<WWW::Mechanize::Page::Link
   "Videos"
   …

The search method of a page object behaves like the search method of Nokogiri, in that it accepts a sequence of CSS selectors and/or XPath queries and returns an enumerable object of matching elements. e.g.

page.search('h3.r a.l', '//h3/a[@class="l"]')
Steve Graham
Hi. I'm not sure what you mean. The above code, "=>[]", refers to the returned value after I enter submit. How do you pass an argument into search? This code works fine for facebook and google without arguments, why are they necessary here?
JZ
Okay. I'll tentatively accept the answer. However, the "Insanely long array of HTML elements" generated is simply the form regenerated (in the case of the college); the info returned after the submit matches agent.page.forms. A real search in the browser yields a large list of classes, which doesn't agree with this return. I added .search to form.submit because I wasn't sure if I was really submitting the form or hitting the clear button with the submit command. Obviously, that was the wrong path. I'm still left with Mechanize not returning any valid info on Owens. Could it be the HTTPS, or the CGI?
JZ
Personally, I think it's a problem specific to the website. I'm not sure if it is a HTTPS issue, although I'm sceptical that would be the cause as Mechanize is intended to be used as such. Regardless, your problem is vexing me considerably, so I'm going to have a more in depth look to see if I can get to the bottom of it.
Steve Graham