I am creating a database containing the names and coordinates of all bus stops in my local area. I have all the names stored in my database, and now I need to add the coordinates. I am trying to get these off a website that shows them all as placemarks on a Google Map. It looks to me like they are fetched from the site's own server and then added to the map. However, I am unable to find exactly where the server is queried for the coordinates.

I am hoping to collect these coordinates with a screen scraper. However, unless I can find where in the source code the coordinates are created, this seems to be impossible. I can of course look up and collect all these placemarks manually, but that would be very time-consuming. So I am hoping that someone here can help me.

This is the website I am trying to scrape. The placemarks are marked by the blue bus sign:

http://reiseplanlegger.skyss.no/scripts/travelmagic/TravelMagicWE.dll/?from=Brimnes+ferjekai+%28Eidfjord%29&to=

You can also get the coordinates of a single placemark by writing the name of the stop in the search field and pushing the "Vis i kart" button.

I hope someone can help me with this.

A: 

Examine the source code of the page. The Google Map arrives blank from Google, and the markers are then added by code. Most probably the coordinates are hardcoded in the page or in a referenced JS file, or the page requests them via Ajax. Either way, you will see it in the source code.

Andrey
+1  A: 

On checking with Firebug, it appears that the site you mentioned is getting the data in XML format with simple AJAX requests such as:

http://reiseplanlegger.skyss.no/scripts/travelmagic/TravelMagicWE.dll/mapxml?x1=4.85321044921875&x2=5.8282470703125&y1=60.150391714056326&y2=60.524184817591276&loc=1

The (x1, y1) and (x2, y2) parameters are the (longitude, latitude) pairs of two opposite corners of the viewport. Every time the map is dragged, a new AJAX request is issued, which returns fresh data for the new viewport.
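
Since each request only covers one viewport, a scraper would probably need to walk the whole area in viewport-sized tiles. The following is a minimal sketch of that idea, not something taken from the site itself: the function names `build_mapxml_url` and `tile_viewports` are made up for illustration, the meaning of the `loc=1` parameter is unknown and is simply copied from the request above, and it is an assumption that the server accepts arbitrary bounding boxes.

```python
from urllib.parse import urlencode

# Endpoint observed in Firebug (see the request above).
BASE_URL = ("http://reiseplanlegger.skyss.no/scripts/travelmagic/"
            "TravelMagicWE.dll/mapxml")


def build_mapxml_url(lon_min, lon_max, lat_min, lat_max):
    """Build a mapxml request URL for one viewport (hypothetical helper)."""
    params = {
        "x1": lon_min,  # longitude of one corner
        "x2": lon_max,  # longitude of the opposite corner
        "y1": lat_min,  # latitude of one corner
        "y2": lat_max,  # latitude of the opposite corner
        "loc": 1,       # copied as-is from the observed request; meaning unknown
    }
    return BASE_URL + "?" + urlencode(params)


def tile_viewports(lon_min, lon_max, lat_min, lat_max, cols, rows):
    """Split a large bounding box into cols x rows smaller viewports."""
    lon_step = (lon_max - lon_min) / cols
    lat_step = (lat_max - lat_min) / rows
    for r in range(rows):
        for c in range(cols):
            yield (
                lon_min + c * lon_step,
                lon_min + (c + 1) * lon_step,
                lat_min + r * lat_step,
                lat_min + (r + 1) * lat_step,
            )


if __name__ == "__main__":
    # Example area only; pick a box that covers the region you need.
    for box in tile_viewports(4.0, 6.0, 60.0, 61.0, cols=2, rows=2):
        print(build_mapxml_url(*box))
```

Each printed URL could then be fetched (for example with `urllib.request.urlopen`), ideally with a pause between requests so the server is not hammered.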

This is a sample response from the AJAX request:

<stages>
  <i n="Arna Stasjon Togstopp (Bergen)" sn="" v="12019888" t="2" i="0" x="5,465809" y="60,420116" sp="" st="Tog.GIF"/>
  <i n="Arna Terminal (Bergen)" sn="" v="12014200" t="2" i="0" x="5,464333" y="60,420319" sp="" st="Buss.GIF"/>
  <i n="Bjørkheim Ved Senter (Samnanger)" sn="" v="12426607" t="2" i="0" x="5,730484" y="60,402178" sp="" st="Buss.GIF"/>
  <i n="Bjørkheim Ved Senter (Samnanger)" sn="" v="12426608" t="2" i="0" x="5,731842" y="60,401312" sp="" st="Buss.GIF"/>
  <i n="Breistein Ferjekai (Bergen)" sn="" v="12017399" t="2" i="0" x="5,399175" y="60,490519" sp="" st="Ferge.GIF"/>
  <i n="Eikelandsosen Terminal (Fusa)" sn="" v="12410510" t="2" i="0" x="5,747773" y="60,241479" sp="" st="Buss.GIF"/>
</stages>

Note that the x attribute holds the longitude, while the y attribute holds the latitude. Both use a comma as the decimal separator, so they need converting before they can be parsed as numbers.
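
A response like the one above could be parsed along these lines. This is only a sketch using Python's standard `xml.etree.ElementTree`; the function name `parse_stops` is invented for the example, and treating the `v` attribute as a stop identifier is a guess based on the duplicate names in the sample, not something the site documents.

```python
import xml.etree.ElementTree as ET

# Two rows taken from the sample response above.
SAMPLE = """<stages>
  <i n="Arna Stasjon Togstopp (Bergen)" sn="" v="12019888" t="2" i="0" x="5,465809" y="60,420116" sp="" st="Tog.GIF"/>
  <i n="Arna Terminal (Bergen)" sn="" v="12014200" t="2" i="0" x="5,464333" y="60,420319" sp="" st="Buss.GIF"/>
</stages>"""


def parse_stops(xml_text):
    """Return a list of dicts with name, v, icon, lon and lat for each <i> element."""
    root = ET.fromstring(xml_text)
    stops = []
    for item in root.iter("i"):
        stops.append({
            "name": item.get("n"),
            "v": item.get("v"),      # assumed to be a stop id (unconfirmed)
            "icon": item.get("st"),  # e.g. Buss.GIF, Tog.GIF, Ferge.GIF
            # The decimal separator is a comma, so swap it before float().
            "lon": float(item.get("x").replace(",", ".")),
            "lat": float(item.get("y").replace(",", ".")),
        })
    return stops


if __name__ == "__main__":
    for stop in parse_stops(SAMPLE):
        print(stop["name"], stop["lat"], stop["lon"])
```

If only bus stops are wanted, filtering on `stop["icon"] == "Buss.GIF"` looks like a reasonable starting point, given the `st` values in the sample.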


Apart from the technical answer, I would suggest seeking permission before scraping such data.

Daniel Vassallo
Thank you so much. A very helpful answer. This is why I love Stack Overflow. As for the permission part: this site is owned by the local county government and paid for by the general public, so the information is considered public information. I did, however, try to contact them and ask for a copy of their database, and they did not even bother to answer.
App_beginner