views:

83

answers:

2

I'm looking for a forgiving HTML parser for scraping HTML and extracting data in Ruby. I've had success using BeautifulSoup for this. What is the Ruby equivalent?

+1  A: 

There was a Rubyful Soup gem, a Ruby port of BeautifulSoup, but it's no longer maintained and its site now recommends Hpricot.

Daniel Vandersluis
+3  A: 

Nokogiri

Also see Nokogiri vs Hpricot before making a choice. Nokogiri seems to outperform Hpricot (I haven't benchmarked it myself) and has a nice syntax, IMO.

Uku Loskit
Thank you. I used Nokogiri and it was sufficient for my purposes. I think the HTML I threw at it was well-formed, so I haven't researched how fault tolerant it is.
Adam Loving