Hello,

I've been trying to pass my login and password from a Python script to the eBay sign-in page. Later I want to run this script from "Google App Engine".

It was suggested that I use "mechanize". Unfortunately, it didn't work for me:


IDLE 1.2.4      
>>> import re
>>> import mechanize
>>> br = mechanize.Browser()
>>> br.open("https://signin.ebay.com")

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    br.open("https://signin.ebay.com")
  File "build\bdist.win32\egg\mechanize\_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "build\bdist.win32\egg\mechanize\_mechanize.py", line 255, in _mech_open
    raise response
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt
>>> 

Earlier I was trying to use Python and twill - it didn't work either until one supporter suggested that I download the latest version of mechanize and then perform the following steps:

  1. Locate the following folder on my computer: "C:\Python25\Lib\site-packages\twill\other_packages\_mechanize_dist"

  2. Change its name to "_mechanize_dist_backup" (the full path, thus, should be "C:\Python25\Lib\site-packages\twill\other_packages\_mechanize_dist_backup")

  3. Copy the "mechanize" folder (which is located in "mechanize-0.2.2" - the folder that I had downloaded and unzipped from the "mechanize" official site) and paste it in "C:\Python25\Lib\site-packages\twill\other_packages" (the full path, thus, being "C:\Python25\Lib\site-packages\twill\other_packages\mechanize")

  4. Change its name to "_mechanize_dist" (the full path being "C:\Python25\Lib\site-packages\twill\other_packages\_mechanize_dist")

  5. Copy the "ClientForm" file from "_mechanize_dist_backup" and paste it in "_mechanize_dist" (in fact, I found two files named "ClientForm" there: one is a Python file, the other is a compiled Python file - I copied and pasted both of them).
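The five manual steps above can be sketched as a small Python function. This is only an illustration of the same folder swap, not something twill or mechanize provides: the function name `swap_bundled_mechanize` and its two parameters are hypothetical, and the assumption that the two "ClientForm" files are named `ClientForm.py` and `ClientForm.pyc` is mine.

```python
import glob
import os
import shutil


def swap_bundled_mechanize(other_packages_dir, new_mechanize_dir):
    """Replace twill's bundled mechanize with a newer copy.

    other_packages_dir: twill's "other_packages" folder (hypothetical parameter).
    new_mechanize_dir:  the unzipped "mechanize" package folder (hypothetical parameter).
    """
    bundled = os.path.join(other_packages_dir, "_mechanize_dist")
    backup = os.path.join(other_packages_dir, "_mechanize_dist_backup")

    # Steps 1-2: keep the old bundled copy under a backup name.
    os.rename(bundled, backup)

    # Steps 3-4: copy the new mechanize package in under the bundled name.
    shutil.copytree(new_mechanize_dir, bundled)

    # Step 5: carry the ClientForm files over from the backup.
    # Assumption: they are named ClientForm.py and ClientForm.pyc.
    for path in glob.glob(os.path.join(backup, "ClientForm.*")):
        shutil.copy(path, bundled)

    return bundled
```

Running the steps from one function makes it easier to repeat them exactly after a reinstall, or to apply them to a copy of twill that lives inside the application folder instead of site-packages.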

After I had performed all these steps, I tried to log in to my eBay account from the twill shell in Python and it worked!!! I could even log in to my Yahoo mailbox in the same way and check my mail!

But now I have a dilemma: I don't know how I could deploy my script to "Google App Engine".

Earlier I had been advised that if I want to use third-party libraries in App Engine projects, I simply have to include them with my application when I deploy it - in the case of twill, for example, I just need to copy the twill folder into my application's folder and deploy it.

But now not only do I have this twill folder as a third-party library to be included, but also all these changes made in "C:\Python25" (in "C:\Python25\Lib\site-packages\twill\other_packages", to be precise), while my application folder - the one in which I have my script ("my_script.py" file) - is located on the "E" drive.

Can anybody, please, give me some suggestions here?

+2  A: 

The error message is indicating that mechanize is obeying the site's robots.txt file for you.
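To illustrate what "obeying robots.txt" means in practice, here is a minimal sketch using the standard library's robots.txt parser (`urllib.robotparser` in Python 3; the module was called `robotparser` in Python 2). It is not mechanize's own code, only the same kind of check. The sample rules and the helper name `is_allowed` are hypothetical and are not a copy of eBay's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only;
# this is NOT eBay's real file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A client that honors robots.txt runs a check like this before each request and refuses to fetch a disallowed URL, which is the kind of refusal the 403 in the traceback reports.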

You should use eBay's API if you want to access their site in an automated way. If you don't, and build your own solution that ignores robots.txt, don't be surprised when they block you and complain to Google about automated queries coming from your App Engine app.

Wooble
Hello, Wooble!!! But I am quite puzzled here - why did mechanize "not obey" in the second case? BTW, if You look into the contents of eBay's robots.txt (https://signin.ebay.com/robots.txt), You will see that they DO allow automated access within certain limits: "eBay may permit automated access to access certain eBay pages but solely for the limited purpose of including content in publicly available search engines" - I am not planning to go beyond those limits.
brilliant
+1  A: 

As for the GAE deployment issue, @brilliant, it looks like the code you're dealing with is all pure python 2.5 (the only really blocking issue would be if it isn't -- no binary extensions allowed, no code requiring Python 2.6 or better allowed, and that's just the way it is on GAE at this time).

So, under this assumption, the only issue w/deploying the code on App Engine is having all the code, NOT in site-packages (from which of course GAE's dev_appserver.py deploys absolutely nothing, nada, zilch), but rather in your GAE project's directory (I suggest a recursive zip of all the .py files, only -- remove all the .pyc files, in particular, before you zip -r it;-).

All in all, it's just a question of a couple of appropriate shell commands: cp -R then zip -r (probably harder on non-unixy shells, but, hey, even on Windows you can do it with bash from cygwin... in any case, it's hardly a "development" issue, per se;-).
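For anyone without a unix-y shell, roughly the same "zip only the .py files" step can be sketched in Python itself. This is a sketch and not part of the App Engine tooling: the function name `zip_py_sources` and its parameters are hypothetical, and it is meant to be run on the development machine with a current Python, not inside the Python 2.5 runtime.

```python
import os
import zipfile


def zip_py_sources(package_dir, zip_path):
    """Zip every .py file under package_dir, keeping the package folder name.

    Returns the list of archive member names that were written.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(package_dir):
            for name in sorted(files):
                if not name.endswith(".py"):
                    continue  # leaves out .pyc files and anything else
                full_path = os.path.join(root, name)
                # Store paths relative to the parent, e.g. "twill/__init__.py".
                arcname = os.path.relpath(full_path, parent).replace(os.sep, "/")
                archive.write(full_path, arcname)
                added.append(arcname)
    return added
```

For example, `zip_py_sources(r"C:\Python25\Lib\site-packages\twill", r"E:\my_app\twill.zip")` would write the archive into an application folder (`E:\my_app` is a made-up path). Python's standard zip import mechanism can then load the package if the script puts the archive on its path first, for instance with `sys.path.insert(0, "twill.zip")`.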

Alex Martelli
Thank You, Alex, for this answer. I am trying it out at the moment.
brilliant
Hello, Alex!!! Sorry for not answering for so long. So all these commands like "cp -R" and "zip -r" are related only to Unix, right? And those folks on Windows, like me, can't use them, right? I have no idea what "bash from cygwin" means. Can You just imagine! I just uninstalled Python and then installed Python, then twill, then mechanize, then performed all those 5 steps (described above) and... alas! I couldn't log in to either Yahoo or eBay anymore! I just wanted to repeat everything from scratch to make sure I remembered every detail and obviously deleted some very important (↙)
brilliant
file or, perhaps, even forgot some very important step. So now I am reviewing all my recent posts and am trying to restore what was a kind of success for me. I can't believe I did such a stupid thing - I should have never uninstalled Python!
brilliant
@brilliant, to use most unix-y shell commands in Windows, install the free third-party package named `cygwin` --- see http://www.cygwin.com/ .
Alex Martelli
I see, Alex. Thank You for this link. You've already helped me so much!!! I am absolutely indebted to You. At the moment I am still struggling trying to restore what I have achieved.
brilliant
@brilliant, you're welcome!
Alex Martelli
Alex, this is getting funny. It seems that GAE simply doesn't like twill - no matter whether it's pure python 2.5 or not. If You have the time and inclination, please check it out: http://stackoverflow.com/questions/3744141/does-gae-accept-twill-at-all
brilliant