questions about mechanize | ansaurus

mechanize

How can I perform a HEAD request with the mechanize library?

I know how to do a HEAD request with httplib, but I have to use mechanize for this site. Essentially, what I need to do is grab a value from the header (filename) without actually downloading the file. Any suggestions how I could accomplish this? ...
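A minimal sketch of one approach, written against Python 3's standard library rather than mechanize itself: override `get_method` on a request class so the opener sends HEAD, then read the filename from the `Content-Disposition` header. mechanize's `Request` follows the same urllib2-style interface, so the same override is often suggested there, but verify that against your mechanize version. The names `HeadRequest`, `filename_from_header`, and `fetch_filename` are illustrative, and the sketch assumes the server reports the filename in `Content-Disposition`.

```python
import urllib.request
from email.message import Message


class HeadRequest(urllib.request.Request):
    """A request whose HTTP method is HEAD, so no body is downloaded."""

    def get_method(self):
        return "HEAD"


def filename_from_header(content_disposition):
    """Extract the filename parameter from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    return msg.get_filename()  # None if no filename parameter is present


def fetch_filename(url):
    """Send a HEAD request and return the advertised filename, if any."""
    with urllib.request.urlopen(HeadRequest(url)) as response:
        header = response.headers.get("Content-Disposition")
    return filename_from_header(header) if header else None
```

Only `fetch_filename` touches the network; the other two pieces can be checked offline.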

Errors with Python's mechanize module

Hello, I'm using the mechanize module to execute some web queries from Python. I want my program to be error-resilient and handle all kinds of errors (wrong URLs, 403/404 responses) gracefully. However, I can't find in mechanize's documentation which exceptions it raises for the various errors. I just call it with: self.browser...
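As a hedged illustration of the usual pattern: mechanize exports `mechanize.HTTPError` and `mechanize.URLError`, modeled on urllib2's exceptions, so a wrapper can catch the HTTP status errors first and the more general URL errors second. The sketch below uses the standard-library equivalents from `urllib.error` so that it runs without mechanize installed. `safe_open` and its `(status, detail)` return shape are hypothetical, and `open_func` stands in for whatever call you make (for example `browser.open`).

```python
import urllib.error


def safe_open(open_func, url):
    """Call open_func(url) and map common failures to (status, detail)."""
    try:
        return ("ok", open_func(url))
    except urllib.error.HTTPError as exc:
        # HTTPError subclasses URLError, so it must be caught first.
        # exc.code holds the status, e.g. 403 or 404.
        return ("http_error", exc.code)
    except urllib.error.URLError as exc:
        # DNS failures, refused connections, and similar transport problems.
        return ("url_error", str(exc.reason))
    except ValueError as exc:
        # urllib raises ValueError for some malformed URLs.
        return ("bad_url", str(exc))
```

Whether the mechanize exception classes are the very same objects as the `urllib.error` ones depends on the mechanize version, so catching them by their `mechanize.` names is the safer choice in real code.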

Is there a PHP equivalent of Perl's WWW::Mechanize?

I'm looking for a library that has functionality similar to Perl's WWW::Mechanize, but for PHP. Basically, it should allow me to submit HTTP GET and POST requests with a simple syntax, and then parse the resulting page and return in a simple format all forms and their fields, along with all links on the page. I know about CURL, but it's...

screen-scraping

Twill/Mechanize access to html content...

Hello! A couple of questions regarding Twill and Mechanize: 1) Is Twill still relevant as a web-automation tool? If yes, then why is it not currently maintained? If no, has Mechanize matured further to support Twill-style simple scripting? Or is there another package that has stepped up to fill the gap? 2) I was able to very quickly set...

Screen scrape a web page that displays data page-wise using Mechanize

I am trying to screen scrape a web page (using Mechanize) which displays the records in a grid page wise. I am able to read the values displayed in the first page but now need to navigate to the next page to read appropriate values. <tr> <td><span>1</span></td> <td><a href="javascript:__doPostBack('gvw_offices','Page$2')">2</a><...
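Pager links of this kind come from ASP.NET WebForms: `__doPostBack(target, argument)` stores its two arguments in the hidden form fields `__EVENTTARGET` and `__EVENTARGUMENT` and then submits the page's form. Because mechanize does not execute JavaScript, one common workaround is to set those two fields yourself and submit the form, keeping the other hidden fields (such as `__VIEWSTATE`) intact. The sketch below covers only the parsing step, in Python; `postback_fields` is a hypothetical helper name, and the submission itself depends on which mechanize port you are using.

```python
import re

# Matches href values like: javascript:__doPostBack('gvw_offices','Page$2')
_POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)','([^']*)'\)")


def postback_fields(href):
    """Return the hidden-field values a __doPostBack link would set."""
    match = _POSTBACK_RE.search(href)
    if match is None:
        return None  # not a postback link
    target, argument = match.groups()
    return {"__EVENTTARGET": target, "__EVENTARGUMENT": argument}
```

For the link in the excerpt, this yields the target `gvw_offices` and the argument `Page$2`, which are the values to place in the form before submitting it.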

screen-scraping

Problems installing mechanize gem on Mac OS X 10.4.11

Hi all, I'm trying to install the mechanize gem on Mac OS X but I keep getting the following error: ERROR: Error installing mechanize: ERROR: Failed to build gem native extension. /usr/local/bin/ruby extconf.rb install mechanize checking for #include ... yes checking for #include ... yes checking for #include ... yes checki...

Python mechanize - two buttons of type 'submit'

I have a mechanize script written in Python that fills out a web form and is supposed to click on the 'create' button. But there's a problem: the form has two buttons, one for 'add attached file' and one for 'create'. Both are of type 'submit', and the attach button is the first one listed. So when I select the form and do br.submit(), ...

Mechanize and JavaScript

Hello folks, I'm new to Ruby and coding just for fun. I just found WWW::Mechanize and loved it right away. Here is my question: I'm connecting to a website and logging in. The website redirects me to new pages, and Mechanize deals with all the cookie and redirection work, but I can't get the last page. I used Firebug and did the same job ag...

How can I keep WWW::Mechanize from following redirects?

I have a Perl script that uses WWW::Mechanize to read from a file and perform some automated tasks on a website. However, the website uses a 302 redirect after every time I request a certain page. I don't want to be redirected (the page that it redirects to takes too long to respond); I just want to loop through the file and call the f...

Is using threads and Ruby mechanize safe?

Does anyone ever see a lot of errors like this: Exception `Net::HTTPBadResponse' at /usr/lib/ruby/1.8/net/http.rb:2022 - wrong status line: SOME HTML CODE HERE when using threads and mechanize? I'm relatively certain that this is some bad behavior between threads and the net/http library, but does anyone have any advice as far as the upp...

Mechanize HTML scraping problem

So I am trying to extract the email of my website using Ruby mechanize and Hpricot. What I am trying to do is loop over all the pages of my administration side and parse them with Hpricot. So far so good. Then I get: Exception `Net::HTTPBadResponse' at /usr/lib/ruby/1.8/net/http.rb:2022 - wrong status line: *SOME HTML CODE HERE* wh...

screen-scraping

How to make mechanize not fail with forms on this page?

import mechanize url = 'http://steamcommunity.com' br=mechanize.Browser(factory=mechanize.RobustFactory()) br.open(url) print br.request print br.form for each in br.forms(): print each print The above code results in: Traceback (most recent call last): File "./mech_test.py", line 12, in <module> for each in br.forms(...

screen-scraping

Scraping multiple HTML files to CSV

I am trying to scrape rows off of over 1200 .htm files that are on my hard drive. On my computer they are here 'file:///home/phi/Data/NHL/pl07-08/PL020001.HTM'. These .htm files are sequential from *20001.htm until *21230.htm. My plan is to eventually toss my data in MySQL or SQLite via a spreadsheet app or just straight in if I can get ...
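A rough sketch of how this could look with only Python 3's standard library, under several assumptions: the files are named `PL020001.HTM` through `PL021230.HTM` as the question suggests, the data sits in ordinary `<tr>`/`<td>` table rows with reasonably well-formed markup, and every row of every file belongs in a single CSV. The names `RowParser`, `rows_from_html`, `report_names`, and `scrape_to_csv` are illustrative, not from the original post.

```python
import csv
import os
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._cell = [], None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row, self._cell = None, None


def rows_from_html(html):
    """Return all table rows in the HTML as lists of cell strings."""
    parser = RowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def report_names(first=20001, last=21230):
    """Yield PL020001.HTM ... PL021230.HTM, following the question's pattern."""
    for number in range(first, last + 1):
        yield f"PL{number:06d}.HTM"


def scrape_to_csv(directory, out_path):
    """Write every table row from every existing report file to one CSV."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for name in report_names():
            path = os.path.join(directory, name)
            if not os.path.exists(path):
                continue  # skip gaps in the sequence
            with open(path, encoding="utf-8", errors="replace") as handle:
                for row in rows_from_html(handle.read()):
                    writer.writerow([name] + row)
```

Prefixing each row with the file name keeps the source of every record visible once the CSV is loaded into MySQL or SQLite. The file encoding is a guess; adjust it if the reports are not UTF-8.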

screen-scraping

Multipart File Upload in Ruby

I simply want to upload an image to a server with POST. As simple as this task sounds, there seems to be no simple solution in Ruby. In my application I am using WWW::Mechanize for most things so I wanted to use it for this too, and had a source like this: f = File.new(filename, File::RDWR) reply = agent.post( 'http://rest-test.her...

How to get links on a webpage using mechanize and open those links

Hi, I want to use mechanize with Python to get all the links on a page and then open those links. How can I do it? ...
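With mechanize, the usual route is to iterate over `br.links()` and pass each link to `br.follow_link(link)`, calling `br.back()` in between. To show the underlying idea without requiring mechanize, here is a small standard-library sketch in Python 3 that collects every `href` on a page and resolves it against the page URL. `LinkCollector` and `extract_links` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Gather absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return the absolute URL of every link in the HTML, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

Each returned URL can then be opened in turn, for example with `urllib.request.urlopen(url)` or with a mechanize browser's `open` method.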

Can Python mechanize handle HTTP auth?

Mechanize (Python) fails with a 401 when I try to open HTTP digest URLs. I googled and tried debugging, but with no success. My code looks like this. import mechanize project = "test" baseurl = "http://trac.somewhere.net" loginurl = "%s/%s/login" % (baseurl, project) b = mechanize.Browser() b.add_password(baseurl, "user", "secret", "some Re...

Caching PHP script outputs on the client side

I have a php script that outputs a random image each time it's called. So when I open the script in a web browser, it shows one image and if I refresh, another image shows up. I'm trying to capture the correct image from visiting the web site through a command line (via mechanize). I used urllib2.urlopen(...) to grab the image, but each...

Mechanize not being installed by easy_install?

I am in the process of migrating from an old Win2K machine to a new and much more powerful Vista 64 bit PC. Most of the migration has gone fairly smoothly - but I did find that I needed to reinstall ALL of my Python related tools. I've downloaded the mechanize-0.1.11.tar.gz file and ran easy_install to install it. This produced C:\Pytho...

Ruby Mechanize getting force_encoding exception

At least sometimes when I navigate to a page I get this exception undefined method `force_encoding' for #<String:0x898922c> Anyone else seen this problem? ...

Greasemonkey-like Firefox plugin for automatic browsing

Is there a plug-in for Firefox that would run a user's JavaScript code, like Greasemonkey, and be able to browse from page to page? I'd like to write a script to: Log in to a website. Follow several links. Make a GET request to that host periodically with given data and time intervals. Make a POST request based on the results of the pre...
