Hi
I would like to parse an HTML page and extract the meaningful text from it. Does anyone know of any good algorithms for this?
I develop my applications in Rails, but I think Ruby is a bit slow for this, so if a good C library exists, that would be more appropriate.
Thanks!!
PS: Please do not recommend anything in Java.
UPDATE: I found this link text
Sadly, it is in Python.