questions about mechanize | ansaurus

mechanize

Python urllib POST question

Hello all, I'm writing a simple Python POST script, but it isn't working well. Logging in takes two steps. The first login uses 'http://mybuddy.buddybuddy.co.kr/userinfo/UserInfo.asp' and the second login uses 'http://user.buddybuddy.co.kr/usercheck/UserCheckPWExec.asp'. I can log in to the first login page, but I couldn't login...
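
A minimal sketch of a two-step login with the Python 3 standard library, assuming the second step depends on a session cookie set by the first (the excerpt does not confirm this). The form field names passed in are hypothetical, and the helper names `build_opener_with_cookies`, `build_post`, and `two_step_login` are invented for illustration; only the two URLs come from the question.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# The two login URLs quoted in the question.
FIRST_LOGIN_URL = "http://mybuddy.buddybuddy.co.kr/userinfo/UserInfo.asp"
SECOND_LOGIN_URL = "http://user.buddybuddy.co.kr/usercheck/UserCheckPWExec.asp"


def build_opener_with_cookies():
    """Return an opener and the cookie jar it stores session cookies in."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_post(url, fields):
    """Build a POST request whose body is the URL-encoded form fields."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def two_step_login(opener, first_fields, second_fields):
    """Send both POSTs through one opener so cookies from step 1 reach step 2."""
    opener.open(build_post(FIRST_LOGIN_URL, first_fields)).read()
    return opener.open(build_post(SECOND_LOGIN_URL, second_fields)).read()
```

Calling `two_step_login(opener, {...}, {...})` with the real field names from each form would perform the network requests; the field names have to be read from the pages' HTML, since the excerpt does not list them.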

Ruby Mechanize: user agents?

How many user agents are there in Mechanize? Is there a handy list of all the user agent options anywhere? ...

Can Mechanize make JavaScript calls?

Can Mechanize make JavaScript calls? This would be handy to negotiate AJAX when screen-scraping... ...

WWW::Mechanize::GZip triggering DIE signal...why?

It's taken me a while to track down a sudden problem with my code, but it appears that WWW::Mechanize::GZip is somehow triggering my $SIG{__DIE__} handler. Consider this code: use strict; use WWW::Mechanize::GZip; $SIG{__DIE__} = sub { print "WTF??? WHY IS THIS BEING TRIGGERED?\n"; }; my $mech = WWW::Mechanize::GZip->new(); $mech->ge...

Get string value from http response with Mechanize

Hi, I'm currently integrating Facebook into my app and I've succeeded in retrieving the access_token using the following code: url="#{url}?#{client_id}&#{client_secret}&#{code}&#{redirect_uri}&type=client_cred" agent = Mechanize.new page = agent.get(url) The page object above has a body which contains text something along t...

Using Mechanize with Google Docs

I'm trying to use Mechanize to log in to Google Docs so that I can scrape something (not possible from the API), but I keep getting a 404 when trying to follow the meta redirect: require 'rubygems' require 'mechanize' USERNAME = "..." PASSWORD = "..." LOGIN_URL = "https://www.google.com/accounts/Login?hl=en&continue=http:/...

Environment variables

I use the mechanize module to log in to a site. When I import twill.commands without any other apparent use, some debug messages [0] are displayed [1]. When I delete the import, these messages disappear. How can I see what is changed in the environment in order to emulate it and remove this dependency? [0] Using the logging module. [1] M...

Other options for accessing input fields with Ruby Mechanize?

According to the documentation: "Mechanize lets you access form input fields in a few different ways". But I can only see one way using accessors. What other options are there? For example: can you reference form field parts like "Mechanize::Form::Text:0x101698168" instead of having to use the name value. ...

Rails "Missing these required gems" error for installed gems

I know this has been asked multiple times before, but I've tried those things and still am not having any luck. For the mechanize gem, I keep getting the "Missing these required gems" error when I run db:migrate on my production server. Here's the full error: Missing these required gems: mechanize You're running: ruby 1.8.6.111...

Can a form field be selected with mechanize based on the type of the field (e.g. TextControl, TextareaControl)?

I'm trying to parse an HTML form using mechanize. The form itself has an arbitrary number of hidden fields, and the field names and IDs are randomly generated, so I have no obvious way to directly select them. Clearly using a name or ID is out, and due to the random number of hidden fields I cannot select them based on the sequence number...

How to add an attribute to Mechanize's (Ruby) Field class and have it use my extended class?

Basically, when I store a form object I want to do some extra analysis on the fields and set a new attribute based on that analysis. Should I just go inside the code and add the attribute manually, or can I just extend it? How do I get Mechanize to use my new class when it gets these fields? Thanks! ...

return <options> as list from <select> box with python (mechanize/twill)

If I were to get something like this with showforms(), how would I get the Values out of the SOME_CODE input box? Form name=ttform (#2) ## ## __Name__________________ __Type___ __ID________ __Value__________________ 1 NUMBER select (None) ['0'] of ['0', '10', '2', '3', '4', ... 2 SOMEYEAR ...

Interact with Flash using Python Mechanize

I am trying to create an automated program in Python that deals with Flash. Right now I am using Python Mechanize, which is great for filling forms, but when it comes to Flash I don't know what to do. Does anyone know how I can interact with Flash forms (set and get variables, click buttons, etc.) via Python Mechanize or some other pytho...

Perl Mechanize, submitting a form with a file (image)?

I can't seem to find a good example of how to do this properly; the ones I have found aren't working for me. I am trying to submit a form using Perl Mechanize, where the form has an image file. The form is as below. It's actually a way I am trying to access this API for a website on which I have an account, and using POST seems to be th...

How do I get Python's Mechanize to POST an ajax request?

The site I'm trying to spider uses the JavaScript call request.open("POST", url, true); to pull in extra information over AJAX that I need to spider. I've tried various permutations of: r = mechanize.urlopen("https://site.tld/dir/" + url, urllib.urlencode({'none' : 'none'})) to get Mechanize to get the page but it always results in...

Nokogiri: Parsing Irregular "<"

I am trying to use nokogiri to parse the following segment <tr> <th>Total Weight</th> <td>< 1 g</td> <td style="text-align: right">0 %</td> </tr> <tr><td class="skinny_black_bar" colspan="3"></td></tr> However, I think the "<" sign in "< 1 g" is causing Nokogiri problems. Does anyone know any workarounds? Is there a...

Nokogiri response different

Does anyone have a problem with Nokogiri acting differently between two servers (staging and production)? On staging, it grabs and returns the page properly (Nokogiri 1.4.2, Mechanize 1.0.0). On production, it returns a much smaller set of HTML that looks like a canned message (Nokogiri 1.4.2, Mechanize 1.0.0). I found out by running it i...

Python - The request headers for mechanize

I am looking for a way to view the request (not response) headers, specifically what browser mechanize claims to be. Also, how would I go about manipulating them, e.g. setting another browser? Example: import mechanize browser = mechanize.Browser() # Now I want to make a request to e.g. example.com with custom headers using browser The pu...
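
A minimal sketch of setting and inspecting outgoing request headers, using the standard library's `urllib.request` as a stand-in rather than mechanize itself (Python mechanize exposes a similar idea through the browser's `addheaders` list). The user-agent string and the helper names `build_request` and `outgoing_headers` are hypothetical, chosen only for illustration.

```python
import urllib.request

# Hypothetical user-agent string, used only for illustration.
CUSTOM_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def build_request(url, user_agent=CUSTOM_USER_AGENT):
    """Create a request object that carries a custom User-Agent header."""
    request = urllib.request.Request(url)
    request.add_header("User-Agent", user_agent)
    return request


def outgoing_headers(request):
    """Return the headers explicitly set on the request as a dict.

    urllib stores header names capitalized, so the key is 'User-agent'.
    """
    return dict(request.header_items())


request = build_request("http://example.com/")
print(outgoing_headers(request))
```

No network traffic happens here; the request object is only built and inspected. Headers that the opener adds at send time (such as Host) are not part of this dict until the request has actually been opened.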

Mechanize does not see some hidden form inputs?

I want to scrape this web page using Mechanize. The form element looks like this: <form name="ctl00" method="post" action="PSearchResults.aspx?state=ME&rp=" id="ctl00"> <div> <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" /> <input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" /> <inpu...

Why can't my Perl script print cookie values?

When I visit usatoday.com with IE, cookie files are automatically created in my Temporary Internet Files folder. But why doesn't the following Perl script capture anything? use WWW::Mechanize; use strict; use warnings; my $browser = WWW::Mechanize->new(); my $response = $browser->get( 'http://www.usatoday.com' ); my $cookie_jar = ...
