Hey guys.
I am working on some code that scrapes a page for two css classes on a page. I am simply using the Hpricot search method for this as so:
webpage.search("body").search("div.first_class | div.second_class")
...for each item found i create an object and put it into an array, this works great except for one thing.
The search will go through the entire html page and add an object into an array every time it comes across '.first_class' and then it will go through the document again looking for '.second_class', resulting in the final array containing all of the searched items in the incorrect order in the array, i.e all of the '.first_class' objects, followed by all the '.second_class' objects.
Is there a way i can get this to search the document in one go and add an object into the array each time it comes across one of the specified classes, giving me an array of items that is in the order they are come across in on the page i am scraping?
Any help much appreciated. Thanks