Hi!
I have been searching for a good solution for a long time, but I can't find anything that fits my needs.
I want to parse an HTML file and display its content in a table. The task is much like writing yet another RSS feed reader. Doing that with valid XML files is simple and straightforward using NSXMLParser, TouchXML, libxml directly, or one of the other XML parsers out there. But these frameworks either only work with XML or cannot cope with non-tidy HTML. The site consists of divs containing links that contain images, or paragraphs containing links and images, and so on: just a normal website. Using libxml seems way too complicated in that case.
Does anybody have more experience with parsing dirty HTML pages? Which (free) library/framework did you use? I have the feeling that I am just missing something obvious here. It can't be that difficult to parse HTML files, can it?
I hope you can point me in the right direction!