I previously asked a question here about Python code that fetches a web page by URL and stores the result in a variable. The first thing I wanted to know was whether the variable in this code contains the HTML of the web page:

from google.appengine.api import urlfetch
url = "http://www.google.com/"
result = urlfetch.fetch(url)
if result.status_code == 200:
    doSomethingWithResult(result.content)

The answer I received was "yes": the variable "result" in the code does contain the HTML of the web page. The programmer who answered added that I needed to "check the Content-Type header and verify that it's either text/html or application/xhtml+xml". I've looked through several Python tutorials but couldn't find anything about headers. So my question is: where is this Content-Type header located, and how can I check it? Also, could I send the content of that variable directly to my mailbox?

Here is where I got this code. It's on Google App Engine.

+1  A: 

For info on sending the Content-Type header, see here:
http://code.google.com/appengine/docs/python/urlfetch/overview.html#Request%5FHeaders

Corey Goldberg
Thanks for this link
brilliant
+1  A: 

If you look at the Google App Engine documentation for the response object, the result of urlfetch.fetch() has a headers member, which holds the HTTP response headers as a mapping of names to values. So all you probably need to do is:

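# the header lives on result.headers and may carry parameters such as "; charset=UTF-8",
# so compare only the media type part
content_type = result.headers.get('Content-Type', '').split(';')[0].strip().lower()
if content_type in ('text/html', 'application/xhtml+xml'):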
    # assuming you want to do something with the content
    doSomethingWithXHTML(result.content)
else:
    # use content for something else
    doTheOtherThing(result.content)
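
In case it helps, the check above can be pulled into a small helper that can be tried outside App Engine. This is only a sketch: the function name is_html_content_type is made up for illustration, and it assumes the header value arrives as a plain string (or None when the header is missing).

```python
def is_html_content_type(header_value):
    """Return True if a Content-Type header value names an HTML or XHTML media type.

    Hypothetical helper, not part of the urlfetch API.
    """
    if not header_value:
        # Missing or empty header: we cannot claim the body is HTML.
        return False
    # Drop parameters such as "; charset=UTF-8" and normalise case,
    # since media type names are case-insensitive.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")
```

With the fetch result from the question, this would presumably be called as is_html_content_type(result.headers.get('Content-Type')), assuming headers behaves like an ordinary mapping.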

As for emailing the variable's contents, I suggest the Python email module.
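
A minimal sketch of building such a message with the standard library's email package (Python 3 style). The function name build_html_email and the addresses are placeholders, and the HTML is assumed to be already decoded to a str. It only constructs the message; actually sending it is a separate step that depends on your environment and is not shown here.

```python
from email.message import EmailMessage


def build_html_email(sender, recipient, subject, html_body):
    """Build (but do not send) an email whose body is the given HTML text.

    Hypothetical helper for illustration only.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # subtype="html" marks the body as text/html rather than text/plain.
    msg.set_content(html_body, subtype="html")
    return msg


# Example with placeholder values standing in for the fetched page.
message = build_html_email(
    "me@example.com",
    "me@example.com",
    "Fetched page",
    "<html><body><p>Hello</p></body></html>",
)
```

If the fetched content is bytes rather than text, it would need to be decoded first (for example with the charset given in the Content-Type header) before being passed in as html_body.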

BenHayden
Thank you very much, extgreen!
brilliant