I found the HTML Agility Pack useful and easy to use for screen scraping web sites. What are the equivalent libraries for HTML screen scraping in Java, Ruby, and Python?

+3  A: 

BeautifulSoup is the standard Python screen scraping tool.

Recently, however, I used pyQuery (incomplete at the moment), which is more or less a port of jQuery to Python, and found it very useful.
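
For anyone who wants to see what this looks like in practice, here is a minimal sketch of pulling links out of a page with BeautifulSoup. It assumes the `bs4` package (BeautifulSoup 4) is installed and uses Python's built-in `html.parser` backend. The `HTML` snippet and the `extract_links` helper are made up for illustration; in real use you would feed in the markup you downloaded.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
HTML = """
<html><body>
  <h1>Example page</h1>
  <ul>
    <li><a href="/first">First link</a></li>
    <li><a href="/second">Second link</a></li>
  </ul>
</body></html>
"""


def extract_links(html):
    """Return (text, href) pairs for every <a> tag that has an href."""
    # "html.parser" is the stdlib backend; lxml can be swapped in if installed.
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


if __name__ == "__main__":
    for text, href in extract_links(HTML):
        print(text, href)
```

Passing `href=True` to `find_all` skips anchors without a link target, so the `a["href"]` lookup cannot raise a `KeyError`.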

cobbal
lxml is good too.
Lennart Regebro
+2  A: 

Found what I was looking for: http://stackoverflow.com/questions/2861/options-for-html-scraping

Sajee