I'm not new to programming (I know Python), but I have no clue where to start in making a bot or a scraper with Python. Should I study CGI programming, or does a scraper run as just a Python script? Should I build a server for it? Thanks for the help.

A: 

Screen scraping usually involves a lot of regular expressions to pull out the exact data you want. Before you start, decide what data you want to analyze and how you want to store it.
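As a rough illustration of the regular-expression approach, here is a minimal sketch that pulls name/price pairs out of a chunk of HTML and turns them into CSV text for storage. The sample markup, the pattern, and the names `SAMPLE_HTML`, `ITEM_RE`, `extract_items` and `to_csv` are all made up for this example; a real page will need its own pattern.

```python
import csv
import io
import re

# Hypothetical sample page; a real scraper would download this text.
SAMPLE_HTML = """
<ul>
  <li class="item">Widget <span class="price">$4.50</span></li>
  <li class="item">Gadget <span class="price">$12.00</span></li>
</ul>
"""

# Capture the item name and the price from each list entry.
ITEM_RE = re.compile(
    r'<li class="item">\s*(?P<name>[^<]+?)\s*'
    r'<span class="price">\$(?P<price>[\d.]+)</span>'
)


def extract_items(html):
    """Return a list of (name, price) tuples found in the HTML text."""
    return [
        (m.group("name"), float(m.group("price")))
        for m in ITEM_RE.finditer(html)
    ]


def to_csv(rows):
    """Serialize rows to CSV text; write this to a file to keep the data."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["name", "price"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(extract_items(SAMPLE_HTML)))
```

Patterns like this break easily when the page layout changes, so treat it as a starting point rather than something robust.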

To get the pages, you'll need a library such as urllib or urllib2 (urllib.request in Python 3). To pull data out of them, use regular expressions (re), or let a parser such as BeautifulSoup do the dirty work (http://www.crummy.com/software/BeautifulSoup/).
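To make that concrete, here is a minimal sketch of a scraper that is nothing more than a plain Python script run from the command line, with no CGI and no server of your own. It assumes Python 3 and uses only the standard library (`urllib.request` to fetch, `html.parser` to collect links) so it runs without installing anything; BeautifulSoup would replace the parser class if you install it. The names `LinkCollector`, `fetch` and `extract_links`, and the `example-scraper/0.1` user agent, are invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def fetch(url, user_agent="example-scraper/0.1"):
    """Download a page and return its text (UTF-8 assumed as a fallback)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html, base_url):
    """Return all link targets found in the HTML text."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page_url = "http://www.crummy.com/software/BeautifulSoup/"
    for link in extract_links(fetch(page_url), page_url):
        print(link)
```

Save it as a .py file and run it with the Python interpreter; that is all a basic scraper needs.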

If you want to build a pure bot that does what the search engines do (a crawler), it also has to be smart enough not to keep pinging the same domain continuously, since that amounts to a denial-of-service (DoS) attack on the site.
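One simple way to avoid hammering a single site is to remember when each domain was last requested and sleep until a minimum delay has passed. The sketch below is only one possible approach; the class name `DomainThrottle` and the 2-second default are assumptions for illustration, not a standard. The clock and sleep functions are parameters only so the behaviour can be checked without actually waiting.

```python
import time
from urllib.parse import urlparse


class DomainThrottle:
    """Enforce a minimum delay between requests to the same domain."""

    def __init__(self, min_delay=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._last_hit = {}  # domain -> time of the most recent request

    def wait(self, url):
        """Block until it is polite to request `url`; return seconds slept."""
        domain = urlparse(url).netloc
        now = self._clock()
        last = self._last_hit.get(domain)
        delay = 0.0
        if last is not None:
            # Sleep only for whatever part of the delay has not yet elapsed.
            delay = max(0.0, self.min_delay - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last_hit[domain] = now + delay
        return delay


if __name__ == "__main__":
    throttle = DomainThrottle(min_delay=2.0)
    for url in ["http://example.com/a", "http://example.com/b"]:
        waited = throttle.wait(url)
        print("waited %.1fs before %s" % (waited, url))
        # ... download the page here ...
```

Call `wait(url)` right before every download; requests to different domains are not delayed by each other.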

Duniyadnd