Here is a quote from here:

So in short ... you need to look at the login page, see what params it uses (e.g. login=xxx, password=yyy), post them to that page, and manage the cookies too; that is where a library like twill comes into the picture.

How could I do this using Python and Google App Engine? Can anybody please give me a clue? I have already asked a question about authenticated requests, but this seems to be a different matter: here I am advised to look at the login page, get its parameters, and also deal with cookies.

+1  A: 

This is not App Engine or Python specific. You need to get familiar with how POST and GET work. When you log into a typical web site, your browser sends a POST to the web server with a bunch of parameters. You can see what the parameters are called by viewing the source of the web page in question and looking for the login form. Once you know the names of the parameters, you can include them in your POST to the web site. The web site will then return a cookie, which would normally be stored in your browser. Since you are trying to simulate a browser, you need to store this cookie yourself and send it along when you request further pages from that particular site.
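The flow described above can be tried end to end without touching a real site. The following is only a sketch under assumptions: it uses Python 3's urllib.request and http.cookiejar (the successors of urllib2 and cookielib), and FakeSite, its /login and /inbox paths, the session cookie and the xxx/yyy credentials are all made up for the demo.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeSite(BaseHTTPRequestHandler):
    """Hypothetical site: POST /login sets a cookie, GET /inbox checks it."""

    def _reply(self, body, cookie=None):
        self.send_response(200)
        if cookie:
            self.send_header("Set-Cookie", cookie)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Read and decode the urlencoded form body sent by the client.
        length = int(self.headers.get("Content-Length", 0))
        fields = urllib.parse.parse_qs(self.rfile.read(length).decode())
        if fields.get("login") == ["xxx"] and fields.get("password") == ["yyy"]:
            self._reply(b"welcome", cookie="session=abc123; Path=/")
        else:
            self._reply(b"bad credentials")

    def do_GET(self):
        cookie = self.headers.get("Cookie", "")
        self._reply(b"inbox" if "session=abc123" in cookie else b"please log in")

    def log_message(self, *args):
        pass  # silence per-request logging


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), FakeSite)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    try:
        jar = http.cookiejar.CookieJar()
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),  # ignore any system proxy for this local demo
            urllib.request.HTTPCookieProcessor(jar),  # stores and resends cookies
        )
        before = opener.open(base + "/inbox").read()
        form = urllib.parse.urlencode({"login": "xxx", "password": "yyy"}).encode()
        opener.open(base + "/login", form).read()  # passing data makes this a POST
        after = opener.open(base + "/inbox").read()
        return before, after, sorted(c.name for c in jar)
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_demo() returns the /inbox body before and after the login POST, plus the names of the cookies the jar ended up holding. The same opener object has to be reused for every request, since the jar attached to it is what carries the session.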

Peter Recore
@Peter: Thank you, Peter. "Since you are trying to simulate a browser, you would need to store this cookie yourself" - but...is it possible to be done on "Google App Engines"?
brilliant
+1  A: 

I am not sure if I understood your question, but if you want the GET parameters, with webapp, it would be something like this:

login = self.request.get('login')
password = self.request.get('password')

More information on dealing with forms is available here

You should also try the user service if you want a quick way to authenticate your users.

jbochi
He is not talking about login on the server side, but about logging in to a site such as Yahoo from a Python script.
Anurag Uniyal
Sorry. My answer is useless then.
jbochi
@jbochi: It's my fault - I didn't describe my question here fully. Thanks for participating anyway.
brilliant
@Anurag Uniyal: "he is not talking about login on server side" - Am I getting something wrong here? Because I've always thought that what I am attempting to do here (logging in from "Google App Engines" using a Python script there) was equal to logging in on the server side.
brilliant
@brilliant, you may be running such a login script on the server side, but it has nothing to do with the server; it is a normal script that could run from anywhere. I think jbochi was trying to explain how to write a server-side login form, which is not what you want, IMO.
Anurag Uniyal
@Aaaah, I see! Thank you.
brilliant
+2  A: 

There are two ways

  1. As I told you, use twill or mechanize. Since twill is just a simple wrapper over mechanize, you may just use mechanize (http://wwwsearch.sourceforge.net/mechanize/), but to use mechanize you may need to do some hacking; see http://stackoverflow.com/questions/275980/import-mechanize-module-to-python-script for more details.

  2. Do it the hard way and learn something while doing it. Let's see how to log in to Yahoo:

a) Look at the page (https://login.yahoo.com/config/login%5Fverify2?&.src=ym) and see what the form looks like; you can use Firebug to inspect it instead of reading the raw HTML.

b) The form has two fields, login and passwd, plus some hidden fields; let's ignore those for now. So far we have the form action URL and the form data:

url = "https://login.yahoo.com/config/login?"
form_data = {'login' : 'my_login', 'passwd' : 'my_passwd'}

c) We can post the above data to the correct POST URL, and it may work, but usually we will need to go on to other pages, and without the cookie the site will ask for the login again. So let's use a cookie jar, e.g.

jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
form_data = urllib.urlencode(form_data)
# the data returned from this page contains a redirect
resp = opener.open(url, form_data)

d) Now the page returned by Yahoo redirects to other pages. For example, if I want to see the mail page, I can go there now and the cookies will take care of authentication, e.g.

resp = opener.open('http://mail.yahoo.com')
print resp.read()

If you look at the printout, it says "xxxx| logout , Hmm... your browser is not officially supported." That means it has logged me in :). Yahoo Mail is an AJAX page and doesn't support my simple scripting browser, but we can get past this too by spoofing the browser type, and then do lots of stuff.
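Spoofing the browser type usually comes down to sending a browser-like User-Agent header with every request. A minimal sketch, assuming Python 3's urllib.request (in Python 2 the urllib2 opener has the same addheaders attribute); make_browser_like_opener and the BROWSER_UA string are made-up names and values for illustration, and whether a given site accepts them is not guaranteed.

```python
import http.cookiejar
import urllib.request

# Assumption: any browser-like string will do; this exact value is only an example.
BROWSER_UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0 Safari/537.36")


def make_browser_like_opener(user_agent=BROWSER_UA):
    """Return (opener, jar); the opener keeps cookies and sends user_agent."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # addheaders replaces the default "Python-urllib/x.y" User-agent header
    # and is applied to every request made through this opener.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar
```

The opener returned here would be used exactly like the one in step c): opener.open(url, form_data) for the login POST, then opener.open(...) for later pages.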

Here is the final code

import urllib, urllib2, cookielib

url = "https://login.yahoo.com/config/login?"
form_data = {'login' : 'your-login', 'passwd' : 'your-pass'}

jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
form_data = urllib.urlencode(form_data)
# the data returned from this page contains a redirect
resp = opener.open(url, form_data)
# Yahoo redirects to http://my.yahoo.com, so let's request the mail page instead
resp = opener.open('http://mail.yahoo.com')
print resp.read()

You should look into the mechanize code or links like this http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=cookielib%5Fexample.py to see how they do it.
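Step b) above skips the hidden fields, but some login forms reject a POST that leaves them out. A rough sketch of collecting every input from the form and merging in the credentials, assuming Python 3's html.parser; FormFieldCollector, build_form_data, the SAMPLE_HTML markup and its field values are hypothetical stand-ins, not Yahoo's real form.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collects the first form's action URL and all named <input> values."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("name"):
            # Inputs without a value attribute are stored as empty strings.
            self.fields[attrs["name"]] = attrs.get("value") or ""


# Hypothetical login page markup, used only to exercise the parser.
SAMPLE_HTML = """
<form method="post" action="https://login.example.com/config/login?">
  <input type="hidden" name=".src" value="ym">
  <input type="hidden" name=".challenge" value="abc123">
  <input type="text" name="login">
  <input type="password" name="passwd">
  <input type="submit" value="Sign In">
</form>
"""


def build_form_data(html, login, passwd):
    """Return (action_url, form_data) with hidden fields kept as served."""
    collector = FormFieldCollector()
    collector.feed(html)
    data = dict(collector.fields)
    data.update({"login": login, "passwd": passwd})
    return collector.action, data
```

In a real run the HTML would come from fetching the login page with the same cookie-aware opener, so that any cookies set on that first request are sent along with the POST.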

we can post this data

Anurag Uniyal
WOW!!!!Anurag Uniyal, Thank you VERY VERY MUCH for this!!! I am going to dive into all these materials provided by you!!!Thanks for spending time on typing all of this.
brilliant
A: 

@Brilliant and @Anurag Uniyal I am having trouble maintaining cookies with app engine. When I attempt the line:

resp = opener.open('http://mail.yahoo.com')

I am being directed to mail.yahoo.com and I am not logged in.

I tried changing the code to this, so that yahoo redirects me automatically but I am getting this message: The browser you're using refuses to sign in. (cookies rejected):

url = "https://login.yahoo.com/config/login_verify2?&.src=ym"
form_data = {'login' : 'astaubbie', 'passwd' : 'tdavis'}
jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
form_data = urllib.urlencode(form_data)
resp = opener.open(url, form_data)
data = resp.read()
full_soup = BeautifulSoup(data)

Any ideas?

Andrew