In one of the answers I received here, I ran into the problem of not knowing how to automatically pass my ID and password through Google App Engine to a website on which I am a registered user and have an account. It was suggested that I "check for an HTTP status code of 401, "authorization required", and provide the kind of HTTP authorization (basic, digest, whatever) that the site is asking for". I don't know how to check for the status code. Can anyone please tell me how to do it?

+++++++++++++++++++++++++++++++++

Additional Information:

If I use this way in Google App Engine (fetching the url of my eBay summary page):

from google.appengine.api import urlfetch
url = "http://my.ebay.com/ws/eBayISAPI.dll?MyEbay&gbh=1&CurrentPage=MyeBaySummary&ssPageName=STRK:ME:LNLK"
result = urlfetch.fetch(url)
if result.status_code == 200:
   print "content-type: text/plain"
   print
   print result.status_code

I always get "200" instead of "401".

+1  A: 

Unless I misunderstand your question, you can read the return code from the response object through its status_code attribute.

First, you'll have to issue a fetch() to the URL you want to test.

jldupont
Hello jldupont!!! Thank you for your response. If I use the way you suggest, I always get the number 200 and that's all I get. Please check the code I just added to the main body of this question above.
brilliant
hmmm... are you referring to an authenticated request then? You need to provide the information in the header of the request. The `fetch()` function allows setting the header fields.
jldupont
Thank you, jldupont! I'll take some time to research it.
brilliant
@brilliant: my pleasure.
jldupont
jldupont, I tried it, but again ran into an error. If you have the time and inclination, please have a look there: http://stackoverflow.com/questions/1912845/authenticated-request-in-google-app-engine-using-fetch-function-how-to-provide
brilliant
@brilliant: the trace-back is about a syntax error. Could it be related to you using the backtick ` instead of the usual tick ' or " ?
jldupont
@jldupont: Yes, you are right. Alex pointed out this mistake to me too. It has been solved here by placing usual ticks: http://stackoverflow.com/questions/1912845/authenticated-request-in-google-app-engine-using-fetch-function-how-to-provide But, strange, the ID and password don't seem to be passed to the website! And I still get "200". Please check that link.
brilliant
@brilliant: you need to be aware that requests go through a proxy: you might have to dig deeper in the response/headers that you get back in order to troubleshoot.
jldupont
@jldupont: "...you might have to dig deeper in the response/headers that you get back..." - Could you, please, give me some clue as to how I could do it?
brilliant
@brilliant: why don't you list the things you have tried by updating your question?
jldupont
@jldupont: "why don't you list the things you have tried by updating your question?" - Because I thought it would balloon the question page too much and would cause the page to lose its characteristic of being a page that tackles only one particular matter. Also, given the fact that the things that I am trying are slightly different, I thought it would be better to allocate them at different question-pages under different titles making it easier for future users to target them and, therefore, find them.
brilliant
+2  A: 

In ordinary Python code, I'd probably use the lower-level httplib, e.g.:

import httplib

domains = 'google.com gmail.com appspot.com'.split()

for domain in domains:
  conn = httplib.HTTPConnection(domain)
  conn.request('GET', '/')
  resp = conn.getresponse()
  print 'Code %r from %r' % (resp.status, domain)

this will show you such codes as 301 (moved permanently) and 302 (moved temporarily); higher level libraries such as urllib2 would handle such things "behind the scenes" for you, which is handy but makes it harder for you to take control with simplicity (you'd have to install your own "url opener" objects, etc).

In App Engine, you're probably better off using urlfetch, which returns a response object with a status_code attribute. If that attribute is 401, it means that you need to repeat the fetch with the appropriate kind of authorization information in the headers.
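The "check for 401, then repeat with authorization" step can be sketched as follows. This is only an illustration under stated assumptions: it is written for Python 3 (where httplib is named http.client), it talks to a toy server started on localhost rather than to a real site, and the names USER, PASSWORD, ProtectedHandler, fetch_status and demo are made up for the example. It also assumes the site uses basic authentication; digest authentication needs more work.

```python
import base64
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

USER, PASSWORD = "demo-user", "demo-pass"  # hypothetical credentials


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic 'Authorization' header."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


class ProtectedHandler(BaseHTTPRequestHandler):
    """Toy server: answers 401 unless the expected Basic credentials arrive."""

    def do_GET(self):
        if self.headers.get("Authorization") == basic_auth_header(USER, PASSWORD):
            body = b"secret page"
            self.send_response(200)
        else:
            body = b"authorization required"
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="demo"')
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass


def fetch_status(port, headers=None):
    """Issue one GET against the toy server and return (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/", headers=headers or {})
    resp = conn.getresponse()
    result = (resp.status, resp.read())
    conn.close()
    return result


def demo():
    server = HTTPServer(("127.0.0.1", 0), ProtectedHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        first = fetch_status(port)  # no credentials yet
        second = first
        if first[0] == 401:  # the server asks for authorization: repeat with it
            auth = {"Authorization": basic_auth_header(USER, PASSWORD)}
            second = fetch_status(port, auth)
        return first, second
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

On App Engine the same idea would presumably apply with urlfetch: compare result.status_code with 401 and, if it matches, call urlfetch.fetch again with the header dictionary passed through its headers argument.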

However, App Engine now also supports urllib2, so if you're comfortable with using this higher level of abstraction you could delegate the work to it. See here for a tutorial on how to delegate basic authentication to urllib2, and here for a more general tutorial on how basic authentication works (I believe that understanding what's going on at the lower layer of abstraction helps you even if you're using the higher layer!-).

Alex Martelli
Alex, thanks for answering again. (1) "...If that attribute is 401, it means that you need to repeat..." - I get "200" all the time (please check the code I just added to the main body of this question above); (2) Thank you for the links. I am studying them at the moment.
brilliant
Alex, I just tried to read through those two links that you have provided here, and it's kind of too overwhelming for me. I think I will stick to AppEngine-urlfetch way.
brilliant
@brilliant, you're getting 200's exactly because urllib2 is doing things "behind the scene" on your behalf; that's handy but makes understanding and control a tad harder. For simple basic auth w/urlfetch (hoping you don't need the more advanced digest auth), see http://chillorb.com/?p=195 (including simpx's comment, it IS needed to make things work;-).
Alex Martelli
Thank you, Alex, for this link. I'll take some time to research it.
brilliant
Hello Alex!!! I just asked another question on using your code here: http://stackoverflow.com/questions/1912845/authenticated-request-in-google-app-engine-using-fetch-function-how-to-provide So, if you have time and willingness, please look it up.
brilliant
+1  A: 

Most user-oriented sites don't use HTTP authentication, preferring instead to use cookie-based authentication, with HTML forms for signin. If you want to duplicate this in your own code, you need to make an HTTP POST request to the login URL for the application in question, and capture the cookie that's sent back, including that in all your future requests to authenticate yourself. Without more details about the specific site you're trying to authenticate against, it's difficult to be more specific.
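The "capture the cookie and send it back" part might look roughly like this. It is a sketch in Python 3 using only the standard library; the helper name cookie_header and the Set-Cookie values are invented for illustration, and a real site may set more cookies or require other headers as well.

```python
from http.cookies import SimpleCookie


def cookie_header(set_cookie_values):
    """Turn the Set-Cookie values of a login response into a Cookie header value."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        jar.load(value)  # parses the name, the value and attributes such as Path
    # Only name=value pairs go back to the server; the attributes are dropped.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Hypothetical Set-Cookie values, as a login response might send them.
login_response_cookies = ["session=abc123; Path=/; HttpOnly", "lang=en; Path=/"]

# Headers to attach to every later request so the site sees the same session.
headers_for_next_request = {"Cookie": cookie_header(login_response_cookies)}
```

The resulting dictionary is what would be passed along with each subsequent request, for example through the headers argument of fetch().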

Nick Johnson
Thank you, Nick, for this input. I'll take some time to look through the materials provided by Alex and then will come back with specifics.
brilliant
+1  A: 

You are not getting 401 because that site never returns 401; it always returns 200. The usual way we code websites is to return 200 with a page saying "Please login..blah blah"; if the site returned anything other than 200, the browser would not display the site's own friendly message.

So in short, as I mentioned in the other question, you need to look at the login page, see what parameters it uses (e.g. login=xxx, password=yyy), and POST them to that page. You will have to manage the cookies too; that is where libraries like twill come into the picture.
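To make "see what params it uses" concrete, here is a small Python 3 sketch that reads the field names out of a login form and builds the POST body. Everything site-specific in it is hypothetical: the LOGIN_PAGE markup, the field names token, login and password, and the LoginFormParser class are made up for the example, and a real login page will differ.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class LoginFormParser(HTMLParser):
    """Collect the action URL and the named inputs of the first form on a page."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Keep preset values (e.g. hidden tokens); empty string otherwise.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


# Hypothetical login page markup, standing in for the fetched page content.
LOGIN_PAGE = """
<form action="/signin" method="post">
  <input type="hidden" name="token" value="t0k3n">
  <input type="text" name="login">
  <input type="password" name="password">
  <input type="submit" value="Sign in">
</form>
"""

parser = LoginFormParser()
parser.feed(LOGIN_PAGE)

# Fill in the credentials and encode the form as a POST body.
form = dict(parser.fields, login="xxx", password="yyy")
payload = urlencode(form)
```

The payload string would then be sent as the body of a POST request to the form's action URL, with a Content-Type of application/x-www-form-urlencoded, and the cookies from the response kept for later requests.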

Anurag Uniyal
Thank you, Anurag Uniyal! I was kind of afraid of this possibility of having to deal with cookies, but this answer of yours gives me some hope.
brilliant
Anurag, here is the continuation of what you have suggested. If you have time and desire, please, check it out: http://stackoverflow.com/questions/1914275/googles-app-engine-python-how-to-get-parameters-from-a-log-in-pages
brilliant