ansaurus

Question

Get webpage contents with Python?

Answer 1

+1 A:

You can use urllib2 and parse the HTML yourself.

Or try Beautiful Soup to do some of the parsing for you.
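As a rough idea of what "parsing the HTML yourself" can look like, here is a minimal sketch using only the standard library's html.parser module (Python 3 naming). The names LinkCollector and extract_links are made up for illustration, and collecting link targets is just one example of what you might pull out of a page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return the href values of all anchor tags in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

For messy real-world markup, a library like Beautiful Soup is usually more forgiving than a hand-written parser.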

JasDev 2009-12-03 22:29:56

Tried urllib2 and urllib, but neither worked. (Edited first post)

Andrew 2009-12-03 22:32:24

Andrew, others can help you better if you describe in detail what you tried and what error message(s) / unexpected behaviour resulted.

micahwittman 2009-12-03 22:35:44

I edited it into my initial post because I didn't want a huge comment. :P.

Andrew 2009-12-03 22:37:39

Answer 2

A:

Python 3 is the future, but if you are trying to get something done right now, I would suggest sticking with 2.x. It has far more documentation and examples available on the web.

Was there any reason for using Python 3?

zdav 2009-12-03 22:38:13

Not really. =/.

Andrew 2009-12-03 22:42:37

Answer 3

+2 A:

Because you're using Python 3.1, you need to use the new Python 3.1 APIs.

Try:

import urllib.request
urllib.request.urlopen('http://www.python.org/')
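The call above returns a response object rather than the page text, so you still need to read and decode it. A minimal sketch, assuming a reasonably recent Python 3 (the with statement on the response is not available in 3.1) and assuming UTF-8 is an acceptable fallback when the server does not declare a charset; fetch_text is a name made up for this example:

```python
import urllib.request


def fetch_text(url, default_encoding="utf-8"):
    """Fetch `url` and return its body decoded to a str.

    `default_encoding` is an assumption used only when the response
    headers do not declare a charset.
    """
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes, not str
        charset = response.headers.get_content_charset() or default_encoding
    # errors="replace" avoids a crash on bytes that do not match the charset.
    return raw.decode(charset, errors="replace")
```

Calling fetch_text('http://www.python.org/') should then give you the page source as a string.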

Alternatively, it looks like you're working from Python 2 examples. You could write it in Python 2, then use the 2to3 tool to convert it. On Windows, 2to3.py is in \python31\tools\scripts. Can someone else point out where to find 2to3.py on other platforms?

Jason R. Coombs 2009-12-03 22:38:21

I'm on Windows. Anyways, thanks, it worked fine. (The page you linked me to looks very helpful, by the way. Thanks for that, especially.)

Andrew 2009-12-03 22:42:04

Answer 4

A:

Mechanize is a great package for "acting like a browser", if you want to handle cookie state, etc.

http://wwwsearch.sourceforge.net/mechanize/
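If all you need from "acting like a browser" is cookie state, the standard library can cover that part on its own. This is not mechanize, just a minimal sketch with http.cookiejar and urllib.request (Python 3 names); make_cookie_opener is a hypothetical helper name.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar
```

Every request made through opener.open(...) then shares the same jar, so cookies set by one response are sent with later requests. Form handling and link following, which mechanize also does, are not covered here.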

Joe Koberg 2009-12-03 22:56:10
