Hi, I have read a good number of tutorials on Hpricot, but the problem I keep hitting is that it does not seem to scrape all of the HTML. I'll elaborate:
The website I am trying to scrape is http://yellowpages.com.mt/Malta-Search/Radio-In-Malta-Gozo.aspx
I need to obtain the links that are listed as results. I need to do this for potentially any URL on that site, so RSS or similar is no help: the program has to read them on the fly from whatever URL I feed it.
I have tried everything to pull out the specific ID I need (giving the direct XPath and so on), but I realised that when I run
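    require 'rubygems'
    require 'hpricot'
    require 'open-uri'

    # Fetch the page with a custom User-Agent and print what Hpricot parsed
    doc = Hpricot(open("http://yellowpages.com.mt/Malta-Search/Radio-In-Malta-Gozo.aspx", 'User-Agent'=>'ruby'))
    puts doc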
the output excludes all the HTML for the links I need! So whichever method I use to scrape, it does not find the required elements, because according to Hpricot they are not there.
When I view the source in Firefox, however, I do see them, so I'm very confused. Does anyone know how to get around this issue? I have been at this for ages and can't find a solution on my own. Any help would be highly appreciated.
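One way to narrow this down would be to check the raw response body before Hpricot touches it, to see whether the markup is missing from what the server sends or is being lost during parsing. The following is only a minimal sketch: `fetch_raw` and `marker_present?` are hypothetical helper names, the marker string is a placeholder for some text that appears near the links in Firefox's source view, and `URI.open` assumes Ruby 2.5 or later (on older Rubies, plain `open` with open-uri loaded plays the same role).

```ruby
require 'open-uri'

# Hypothetical helper: download the page body as a plain string,
# without passing it through any HTML parser.
def fetch_raw(url, user_agent = 'ruby')
  URI.open(url, 'User-Agent' => user_agent).read
end

# Hypothetical helper: true if the marker text occurs in the raw HTML.
def marker_present?(html, marker)
  html.include?(marker)
end

# Example usage (placeholder marker; replace it with text copied from
# the part of the page source that Firefox shows around the links):
#   raw = fetch_raw("http://yellowpages.com.mt/Malta-Search/Radio-In-Malta-Gozo.aspx")
#   puts marker_present?(raw, "text copied from the browser source")
```

If the marker is absent from the raw body, the server is returning different HTML to the script than to the browser (trying a browser-like User-Agent string would be one thing to test). If it is present in the raw body but not in the `puts doc` output, the loss is happening in the parsing step.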