I'm looking for a forgiving HTML parser for scraping HTML and extracting data in Ruby. I've had success using BeautifulSoup for this - what is the Ruby equivalent?
+1
A:
There was a Rubyful Soup gem, which was a Ruby port of BeautifulSoup, but it's no longer maintained and its site now recommends Hpricot.
Daniel Vandersluis
2010-09-15 21:50:41
+3
A:
Also see "Nokogiri vs Hpricot" before making a choice. Nokogiri seems to outperform Hpricot (I haven't benchmarked it myself) and has a nice syntax IMO.
Uku Loskit
2010-09-15 21:51:00
Thank you. I used Nokogiri and it was sufficient for my purposes. I think the HTML I threw at it was well-formed, so I haven't researched how fault-tolerant it is.
Adam Loving
2010-09-16 17:34:46