Hi, I am having a problem scraping the code I need to extract information for a web mashup I'm creating.
Basically, I am trying to scrape code from:
http://yellowpages.com.mt/Meranti-Ltd-In-Malta-Gozo;/Hair-Accessories;Hijjhkikke=Hiojhhfokje.aspx
This is just one of the pages I will need to scrape, so I cannot hard-code the values I need into the program =/.
When I scrape the page using the following code (in Hpricot):
puts open(ypUrl, 'User-Agent'=>'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2') { |f| Hpricot(f) }
I notice that instead of the part of the code I require, I only see the script reference, namely:
<script type="text/javascript" src="http://maps.google.com/maps?file=api&amp;v=2&amp;sensor=false&amp;key=ABQIAAAA8JYIIyGmC1BLOU85GKKkPRSNQenRT-s-Gs-9sYb3ZSBhRRTdcRTMq3zWEID1E35uXl9bdQKIPQIjNQ"></script><title>
Beautimport Ltd (Balmain Hair Extensions) in Malta | Yellow Pages?? (Malta) Ltd | YellowPages.com.mt
This is also what I see when I do View Source in Firefox. However, when I hover over the elements in Firebug, I am able to get an XPath, which unfortunately does not work because the fetched page only contains the script reference and not the markup the script generates (I'm not sure if I'm explaining this correctly). I really need all the code that the script generates on the page (which so far is only viewable in Firebug). I need this so that I can extract the following (taken from Firebug by hovering over the Google icon on the map):
<a title="Click to see this area on Google Maps" href="http://maps.google.com/maps?ll=35.88805,14.46627&spn=0.006988,0.015922&z=16&key=ABQIAAAA8JYIIyGmC1BLOU85GKKkPRSNQenRT-s-Gs-9sYb3ZSBhRRTdcRTMq3zWEID1E35uXl9bdQKIPQIjNQ&sensor=false&mapclient=jsapi&oi=map_misc&ct=api_logo" target="_blank">
which gives the following XPath (// denotes a tbody), but as I mentioned, since Hpricot is not getting the entire code, it's pretty useless because it can't reach that element!
/html/body/form/table//tr/td/div/table[2]//tr[2]/td[2]/div/div[2]/table//tr/td/div/div[2]/a
In this manner I would be able to extract the Lng and Lat, which I really require for my project. I really don't know how to go about this in another manner using Hpricot, as it's not giving me all the code I require. Any help would be extremely appreciated.
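For reference, once that href is available, pulling the coordinates out of it should be the easy part. Below is a minimal sketch of what I have in mind, assuming the `ll` query parameter always holds `lat,lng` as in the link above. `lat_lng_from_maps_href` is just a hypothetical helper name, and the sample URL is the Firebug link with the API key and some parameters left out. It does not solve the actual problem of getting the script-generated markup in the first place.

```ruby
require 'uri'

# Hypothetical helper: read the "ll" query parameter ("lat,lng") from a
# Google Maps link and return [lat, lng] as Floats, or nil if it is absent.
def lat_lng_from_maps_href(href)
  query = URI.parse(href).query
  return nil if query.nil?

  # decode_www_form returns [[name, value], ...]; assoc finds the "ll" pair.
  pair = URI.decode_www_form(query).assoc('ll')
  return nil if pair.nil?

  lat, lng = pair[1].split(',', 2).map { |v| Float(v) }
  [lat, lng]
end

# Shortened form of the href copied from Firebug (API key omitted).
href = 'http://maps.google.com/maps?ll=35.88805,14.46627&spn=0.006988,0.015922&z=16&sensor=false'
lat, lng = lat_lng_from_maps_href(href)
puts "lat=#{lat} lng=#{lng}"
```

If the link has no `ll` parameter the helper returns nil, so the caller can skip that page instead of crashing.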