When I run this code on my computer with the help of "Google App Engine SDK", it displays (in my browser) the HTML code of the Google home page:

from google.appengine.api import urlfetch
url = "http://www.google.com/"
result = urlfetch.fetch(url)
print result.content 

How can I make it display the page itself? I mean I want to see that page in my browser the way it would normally be seen by any user of the internet.


Update 1:

I see I have received a few questions that look a bit complicated to me, although I definitely remember I was able to do it, and it was very simple, except I don't remember what exactly I changed in this code then.

Perhaps, I didn't give You all enough details on how I run this code and where I found it. So, let me tell You what I did. I only installed Python 2.5 on my computer and then downloaded "Google App Engine SDK" and installed it, too. Following the instructions on "GAE" page (http://code.google.com/appengine/docs/python/gettingstarted/helloworld.html) I created a directory and named it “My_test”, then I created a “my_test.py” in it containing that small piece of the code that I mentioned in my question.

Then, continuing to follow on the said instructions, I created an “app.yaml” file in it, in which my “my_test.py” file was mentioned. After that in “Google App Engine Launcher” I found “My_test” directory and clicked on Run button, and then on Browse. Then, having visited this URL http://localhost:8080/ in my web browser, I saw the results.

I definitely remember I was able to display any page in my browser in this way, and it was very simple, except I don’t remember what exactly I changed in the code (it was a slight change). Now, all I can see is a raw HTML code of a page, but not a page itself.


Update 2:

(this update is my response to wescpy)

Hello, wescpy!!! I've tried Your updated code and something didn't work well there. Perhaps, it's because I am not using a certain framework that I am supposed to use for this code. Please, take a look at this screen shot (I guess You'll need to right-click this image to see it in better resolution): [screenshot]

A: 

Special characters such as < and > are likely encoded; you'd have to decode them again for the browser to interpret them as code.

Jonas B
Thank You, Jonas, for telling me that, but I remember I somehow did that and I didn't have any problems with special characters then - perhaps, I just was lucky enough not to have stumbled upon such problems. Please refer to "Update 1" section in my question to see the details. Thank You.
brilliant
Wrong - nothing here is doing entity encoding.
Nick Johnson
Ok, sorry I couldn't help :)
Jonas B
+1  A: 

It's not that easy: you have to parse the content and convert relative paths to absolute ones for images and JavaScript files.
Anyway, give it a try by adding the correct Content-Type:

from google.appengine.api import urlfetch
url = "http://www.google.com/"
result = urlfetch.fetch(url)
print 'Content-Type: text/html'
print ''
print result.content
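
The empty print between the header and the content matters: in CGI-style output, the headers come first, then one blank line, then the body. Here is a minimal sketch of that layout that runs on a plain Python interpreter without the App Engine SDK. The helper names build_cgi_response and split_cgi_response are made up for this illustration and are not part of any library.

```python
def build_cgi_response(body, content_type="text/html"):
    """Return the text a CGI script would print: header, blank line, body."""
    header = "Content-Type: %s" % content_type
    # The empty line ("\n\n") marks the end of the headers. Everything after
    # it is treated as the page content.
    return header + "\n\n" + body


def split_cgi_response(raw):
    """Split CGI output into (headers dict, body) at the first blank line."""
    head, _, body = raw.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body


response = build_cgi_response("<html><body>Hello</body></html>")
headers, body = split_cgi_response(response)
```

With the blank line in place, the header and the HTML separate cleanly; leave it out and there is nothing to tell the two apart.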
systempuntoout
Hello, systempuntoout!!! Thank You for this input, but I tried it and the result is still the same - all I can see is a raw HTML code of a page, but not a page itself. Perhaps, I wasn't descriptive enough in terms of specifics on how I run this code and where I found it. Please refer to "Update 1" section in my question to see the details. Thank You.
brilliant
You forgot the blank line between the headers and the content. This is why using a framework is a good idea 99.99% of the time.
Nick Johnson
@Nick thanks, corrected.
systempuntoout
@Nick: Thank You, Nick, it works!!!
brilliant
@systempuntoout: Yes, systempuntoout, that's exactly how I did it before, thank You for bringing this code back to my memory and thank You for Your time!!!
brilliant
@brilliant you are welcome
systempuntoout
+1  A: 

a more complete example would look something like this:

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.api import urlfetch

class MainHandler(webapp.RequestHandler):
    def get(self):
        url = "http://www.google.com/"
        result = urlfetch.fetch(url)
        self.response.out.write(result.content)

application = webapp.WSGIApplication([
    ('/', MainHandler),
], debug=True)

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()

but as others have said, it's not that easy to do because you're not in the server's domain, meaning the pages will likely not look correct due to missing static content (JS, CSS, and/or images)... unless full pathnames are used or everything that's needed is embedded into the page itself.
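
one cheap way to get closer, short of rewriting every link, is to insert a <base href="..."> tag into the fetched HTML so the browser resolves relative paths against the original site. this is only a rough sketch: add_base_href is a hypothetical helper, the regular expression is not a real HTML parser, and scripts on the page may still misbehave when served from another domain.

```python
import re


def add_base_href(html, base_url):
    """Insert a <base> tag right after the opening <head> tag, if present."""
    base_tag = '<base href="%s">' % base_url
    # Match the opening <head ...> tag, ignoring case and any attributes.
    # The \b keeps this from matching tags such as <header>.
    pattern = re.compile(r"(<head\b[^>]*>)", re.IGNORECASE)
    new_html, count = pattern.subn(
        lambda match: match.group(1) + base_tag, html, count=1)
    if count == 0:
        # No <head> tag found: put the tag at the very start instead.
        return base_tag + html
    return new_html


page = ('<html><head><title>t</title></head>'
        '<body><img src="/logo.png"></body></html>')
fixed = add_base_href(page, "http://www.google.com/")
```

in the handler above, the idea would be to write add_base_href(result.content, url) to the response instead of result.content.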

UPDATE 1:

as mentioned before, you cannot just download the HTML source and expect things to render correctly because you don't necessarily have access to the static data. if you really want to render it as it was meant to be seen, you have to just redirect... here's the modified piece of code:

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class MainHandler(webapp.RequestHandler):
    def get(self):
        # Send the browser to the real site instead of fetching it ourselves.
        url = "http://www.google.com/"
        self.redirect(url)

application = webapp.WSGIApplication([
    ('/', MainHandler),
], debug=True)

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()

UPDATE 2:

sorry! it was a cut-n-paste error. now try it.

wescpy
Hello, wescpy!!!! Thank You for this code, but I definitely remember I did it somehow before and it was very simple - the code was only slightly different from the one mentioned in my question. Please refer to "Update 1" section in my question to see the details. Thank You.
brilliant
Hello, wescpy!!!! I've tried Your updated code, but still something doesn't go right. Is it because I am not using any framework? If You have time and desire, please, check out "Update 2" section in my question. Thank You.
brilliant
Yes, wescpy!!! Now, after your update 2, it works!!! Thank you very much!!!
brilliant