I'm connecting to the Google Maps API from PHP to geocode some starting points for a rental station locator application.

Those starting points don't have to be exact addresses; city names are enough. Geocoding responses with an accuracy equal to or greater than 4 (city/locality level) are used as starting points, and the surrounding rental stations are then searched.

The application is supposed to work in Germany. When a locality name is ambiguous (i.e. there is more than one place of that name), I want to display a list of possibilities.

That is not a problem in general: If you make an ambiguous search, Google's XML output returns a list of <PlaceMark> elements instead of just one.

Obviously, I need to bias the geocoding towards Germany, so if somebody enters a postcode or the name of a locality that exists in other countries as well, only hits in Germany actually come up.

I thought I could achieve this by adding , de or , Deutschland to the search query. This works mostly fine, but produces bizarre and intolerable results in some cases.

There are, for example, 27 localities in Germany named Neustadt. (Wikipedia)

When I search for Neustadt alone:

http://maps.google.com/maps/geo?hl=de&output=xml&key=xyz&q=Neustadt

I get at least six of them, which I could live with (it could be that the others are not incorporated, or parts of a different locality, or whatever).

When, however, I search for Neustadt, de, or Neustadt, Deutschland, or Neustadt, Germany, I get only one of the twenty-seven localities, for no apparent reason: it is not the biggest, nor is it the most accurate, nor does it have any other unique characteristics.

Does anybody know why this is, and what I can do about it?

I tried the region parameter, but to no avail: when I do not use , de, postcodes (like 50825) are resolved to their US counterparts, not the German ones.

My current workaround idea is to add the country name when the input is numeric only, and otherwise keep only the German results, but this sounds awfully kludgy. Does anybody know a better way?
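For illustration, the workaround described above might look roughly like this. It is only a sketch in Python (the logic ports directly to PHP), and the names `build_query`, `keep_german` and the `country_code` key are hypothetical; they assume each placemark has already been parsed into a dictionary that carries its country code.

```python
import re


def build_query(user_input, country_name="Deutschland"):
    """Append the country name only when the input is purely numeric (a postcode)."""
    text = user_input.strip()
    if re.fullmatch(r"\d+", text):
        return text + ", " + country_name
    return text


def keep_german(placemarks, country_code="DE"):
    """Keep only the placemarks whose (assumed) 'country_code' key matches."""
    return [p for p in placemarks if p.get("country_code") == country_code]


if __name__ == "__main__":
    print(build_query("50825"))
    print(build_query("Neustadt"))
```

Locality names are sent unchanged so that the geocoder still returns its full list of candidates, and the country filter is applied to that list afterwards.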

+1  A: 

This is definitely not an exhaustive answer, but just a few notes:

  1. You are using the old V2 version of the Geocoding API, which Google have recently deprecated in favour of the new V3 API. Google recommends using the new service from now on, and while I have no real experience with the new version, it seems that they have improved the service in various respects, especially the structure of the response. You do not need an API key to use the new service, and you simply need to use a slightly different URL:

    http://maps.google.com/maps/api/geocode/xml?address=Neustadt&sensor=false

  2. You mentioned that you were filtering for placemarks on their accuracy property. Note that this field does not appear anymore in the results of the new Geocoding API, but in any case, I think it was still not very reliable in the old API.

  3. You may want to try to use the bounds and region parameters of the new API, but I still doubt that they will solve your problem.
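As a rough sketch of the points above, the following Python snippet builds a V3 request URL with the region parameter and keeps only the results whose country component is DE. The element names (`result`, `address_component`, `short_name`, `type`, `formatted_address`) reflect my understanding of the V3 XML layout and should be checked against a real response; `SAMPLE_XML` is hand-written placeholder data, not actual API output, and the function names are hypothetical.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "http://maps.google.com/maps/api/geocode/xml"


def build_geocode_url(address, region="de"):
    """Build a V3 geocoding URL biased towards the given region."""
    params = {"address": address, "sensor": "false", "region": region}
    return BASE_URL + "?" + urlencode(params)


def results_in_country(xml_text, country_code="DE"):
    """Return the formatted addresses of results located in country_code."""
    root = ET.fromstring(xml_text.strip())
    matches = []
    for result in root.findall("result"):
        for comp in result.findall("address_component"):
            types = [t.text for t in comp.findall("type")]
            if "country" in types and comp.findtext("short_name") == country_code:
                matches.append(result.findtext("formatted_address"))
                break
    return matches


# Hand-written placeholder response (assumed structure, not real data).
SAMPLE_XML = """
<GeocodeResponse>
  <status>OK</status>
  <result>
    <formatted_address>Sample Town A, Deutschland</formatted_address>
    <address_component>
      <long_name>Deutschland</long_name>
      <short_name>DE</short_name>
      <type>country</type>
      <type>political</type>
    </address_component>
  </result>
  <result>
    <formatted_address>Sample Town B, USA</formatted_address>
    <address_component>
      <long_name>United States</long_name>
      <short_name>US</short_name>
      <type>country</type>
      <type>political</type>
    </address_component>
  </result>
</GeocodeResponse>
"""

if __name__ == "__main__":
    print(build_geocode_url("Neustadt"))
    print(results_in_country(SAMPLE_XML))
```

Filtering on the country component rather than on the query string leaves the search term untouched, so an ambiguous name can still return several candidates.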

I believe that the Google Geocoder is a great tool for when you give it a full address in at least a "street, locality, country" format, and it is also very reliable in other formats when it doesn't have to deal with any ambiguities (Geocoding "London, UK" always worked for me).

However, in your case, I would really consider pre-computing all the coordinates of each German locality and simply handle the geocoding yourself from within your database. I think this is quite feasible especially since your application is localized to just one country. Each town in the Wikipedia "List of German Towns" appears to have the coordinates stored inside a neat little <span> which looks very easy to parse:

<span class="geo">47.84556; 8.85167</span>

There are sixteen Neustadts in that list, which may be better than Google's six :)
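A minimal sketch of how such a span could be parsed, assuming the markup always follows the "latitude; longitude" pattern shown above (the regular expression and the function name `parse_geo_spans` are my own, and real pages may need a proper HTML parser):

```python
import re

# Matches <span class="geo">LAT; LON</span> with optional whitespace and signs.
GEO_SPAN = re.compile(
    r'<span class="geo">\s*(-?\d+(?:\.\d+)?)\s*;\s*(-?\d+(?:\.\d+)?)\s*</span>'
)


def parse_geo_spans(html):
    """Return a list of (latitude, longitude) tuples found in the HTML text."""
    return [(float(lat), float(lon)) for lat, lon in GEO_SPAN.findall(html)]


if __name__ == "__main__":
    print(parse_geo_spans('<span class="geo">47.84556; 8.85167</span>'))
```

The resulting pairs could then be stored next to the town names in the database, so that ambiguous names are resolved locally.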

Daniel Vassallo
+1 thanks for pointing out the API version quirks: That's the first thing I'm going to change. Pre-parsing a town name list is a very good idea; in fact, that was the way it was done until now (with hard-coded positions instead of dynamic calculations). I would like to avoid the extra work of implementing this, but it may indeed be the best way to go. Cheers!
Pekka
@Pekka: Note that the Wikipedia list is just the first thing that came to mind. There may be other data sources out there, which may be much more convenient to use.
Daniel Vassallo