I would like to fetch certain .html files from a web server. Specifically, I want to fetch the .html files from a web site (http://www.thetabworld.com/) that have the word "metallica" in the file name. How can I do that using Python? I have heard about urllib2, but as a Python noob I don't have the slightest idea how to use it.

+1  A: 

You need to use urllib2 together with an HTML parser such as lxml or BeautifulSoup: urllib2 retrieves the pages, and the parser extracts the links from them so that you can crawl the site.
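
For illustration only, here is a minimal sketch of that idea for a single page. It makes several assumptions that are not part of the answer above: it targets Python 3, where urllib2's functionality lives in urllib.request; it swaps in the standard library's html.parser in place of lxml or BeautifulSoup so that nothing needs installing; and the names LinkCollector, matching_links and fetch are hypothetical helpers, not part of any library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us lowercased tag and attribute names.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def matching_links(html, base_url, word="metallica"):
    """Return links to .html files whose file name contains `word`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    found = []
    for link in collector.links:
        # Compare against the last path segment, case-insensitively.
        filename = urlparse(link).path.rsplit("/", 1)[-1].lower()
        if filename.endswith(".html") and word in filename:
            found.append(link)
    return found


def fetch(url):
    """Download a page and decode it (assumes UTF-8, replacing bad bytes)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

Something like matching_links(fetch("http://www.thetabworld.com/"), "http://www.thetabworld.com/") would list the matching links found on the front page only, and each of those could then be downloaded with fetch. Crawling the whole site would mean repeating this for every page the links lead to, while keeping track of the pages already visited.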

Ignacio Vazquez-Abrams
+1  A: 

"I have heard about urllib2, but as a Python noob I don't have the slightest idea how to use it."

Well, if you don't know how to use urllib2, reading some docs would be a good start.

The following are excellent resources (with examples):

official python docs for urllib2
urllib2 - the missing manual
urllib2 cookbook
PyMOTW - urllib2

Corey Goldberg
RTFM is not a very helpful response
Steve McLeod
Steve, my answer gave 4 useful links to the best resources on urllib2, and it was accepted by the OP. So I would call it a "helpful response".
Corey Goldberg