I love the Beautiful Soup scraping library in Python. It just works. Is there a close equivalent in Ruby?

+1  A: 

Hpricot? I don't know what others are using...

Oli
A: 

Google is your friend: http://www.crummy.com/software/RubyfulSoup/

thebigjc
Blindly posting Google hits is not the best way to answer a question. If you follow the link you posted, the page itself tells you to use Hpricot instead.
jjnguy
+3  A: 

There's scRUBYt!, Rubyful-soup (no longer maintained), WWW::Mechanize, scrAPI and a few more.

Or you could just use Hpricot or Nokogiri for parsing.

SimonV
+2  A: 

Nokogiri is another HTML/XML parser. It's faster than Hpricot according to these benchmarks. Nokogiri uses libxml2 and is a drop-in replacement for Hpricot. It also supports CSS3 selectors, which is pretty nice.

Edit: There's a new benchmark comparing Nokogiri, libxml-ruby, Hpricot and REXML here.

Ruby Toolbox has a category on HTML parsers here.

Jack Chu
+1  A: 

This image from Ruby Toolbox indicates the relative popularity of various parsers:

[image: Ruby Toolbox chart of HTML parser popularity]

ski