views: 696
answers: 6

What is the most efficient way of clustering coordinates on the server side? Examples?

Here is an example of exactly what I need (although I can't use third-party vendors):

http://www.crunchpanorama.com/

PostgreSQL, Python platform, Google Maps.

+1  A: 

Have a look at using a quadtree.

  1. Put all items into a quadtree (or a simple grid).
  2. Then, for each lowest-level box:
    • Get all the items in that box and in the boxes on each side of it.
    • Work out the distances between all the items you got.

(You are using the quadtree or grid to reduce your search space; otherwise you are O(n^2) in the number of items you have.)
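The grid variant of the steps above can be sketched as follows. This is a toy illustration, not production code; `cell_size` is an assumed tuning parameter, roughly the radius at which you want points merged:

```python
from collections import defaultdict

def cluster(points, cell_size):
    """Group (x, y) points by snapping them to a grid, then merge each
    cell with its 8 neighbouring cells. Illustrative sketch only."""
    grid = defaultdict(list)
    for p in points:
        grid[(int(p[0] // cell_size), int(p[1] // cell_size))].append(p)

    clusters = []
    seen = set()
    for cell in list(grid):
        if cell in seen:
            continue
        # gather this cell plus its 8 neighbours into one cluster
        members = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nb = (cell[0] + dx, cell[1] + dy)
                if nb in grid and nb not in seen:
                    members.extend(grid[nb])
                    seen.add(nb)
        clusters.append(members)
    return clusters
```

Because only the current cell and its neighbours are examined, each point is compared against a small constant number of cells instead of against every other point.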

Ian Ringrose
Do you have any examples that use quadtree?
Eeyore
Sorry, I don't have any examples, but there must be some on the net or in a good GIS book.
Ian Ringrose
A: 

I have posted this link a couple of times before. It's written for the Static Maps API in PHP, but you can port it to Python. It should get you started on the clustering algorithm.

I have a Google Maps application which sends a request to the server containing the map bounds and zoom level whenever the map is moved. The server gets all the markers within the bounds, clusters them, and returns a list of markers to be added to the map.

UPDATE: 30k markers would probably be too many to cluster every time someone moves the map. However, if the users have no control over which markers are visible, there is a solution: pre-cluster the markers. I believe that this is how sites like the one you posted work. Every time you update your set of markers, run the clusterer and save the results for each zoom level. Then use your set of pre-clustered markers to decide which markers to send to the browser when the user moves the map.
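A minimal sketch of that pre-clustering pipeline might look like this. All function names here are hypothetical, and the grid clusterer is just a stand-in for whatever clustering algorithm you actually use:

```python
def precluster(markers, zoom_levels, cluster_for_zoom):
    """Run the clusterer once per zoom level and cache the results.
    Re-run only when the marker set changes, not on every map move."""
    return {z: cluster_for_zoom(markers, z) for z in zoom_levels}

def grid_cluster(markers, zoom):
    """Toy clusterer: one cluster per grid cell; cells shrink with zoom."""
    cell = 360.0 / (2 ** zoom)          # degrees per cell at this zoom
    cells = {}
    for lat, lng in markers:
        key = (int(lat // cell), int(lng // cell))
        cells.setdefault(key, []).append((lat, lng))
    # represent each cluster by its centroid and a point count
    return [
        (sum(p[0] for p in pts) / len(pts),
         sum(p[1] for p in pts) / len(pts),
         len(pts))
        for pts in cells.values()
    ]

def markers_in_bounds(clusters, south, west, north, east):
    """At request time, just filter the cached clusters by map bounds."""
    return [c for c in clusters
            if south <= c[0] <= north and west <= c[1] <= east]
```

The expensive clustering step runs offline; each map move costs only a cheap bounds filter over the cached results for the current zoom level.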

Chris B
I saw that example - it will be terribly slow when the clustering has to run over the entire map, not just a region.
Eeyore
How many markers will you have? And will the markers be the same for everyone, or will the users be able to change which markers are visible?
Chris B
I have about 30k markers and the number will keep growing. Users will have no control of which markers are visible.
Eeyore
If users don't control the markers, you can pre-cluster them - users would never have to wait for clustering.
Chris B
Agree with Chris: for non-user-specific data, pre-cluster the first 5 or 6 zoom levels and you'll be fine. In my experience the clustering itself is not too bad once you have the data in memory; the I/O (database/file/API) is where the time goes.
webclimber
A: 

I think the site you posted as an example is using client-side clustering. You can find an example of that at http://gmaps-utility-library.googlecode.com/svn/trunk/markerclusterer/1.0/examples/

tdelev
CrunchPanorama most definitely uses server-side clustering. However, the MarkerClusterer (client-side) solution is great if you don't have more than a few thousand markers.
Chris B
A: 

Look at http://www.geocubes.com; they provide a solution for clustering a large number of markers.

A: 

I use server-side clustering on my site (Canadian real estate and rentals), but I modify my search depending on the zoom. When zoomed in to the city level, I limit the search to 1000 results, and if some of the results overlap, I cluster them.

At higher zoom levels, I show summary details for the city, province or country.
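One way to merge overlapping results might look like this. It is a greedy sketch under assumed projected (screen-space) coordinates, not the site's actual code; `radius` is a hypothetical overlap threshold:

```python
def merge_overlapping(markers, radius):
    """Greedily fold each marker into the first existing cluster whose
    centroid lies within `radius`; otherwise start a new cluster."""
    clusters = []  # each entry is [sum_x, sum_y, count]
    for x, y in markers:
        for c in clusters:
            cx, cy = c[0] / c[2], c[1] / c[2]
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                c[0] += x
                c[1] += y
                c[2] += 1
                break
        else:
            clusters.append([x, y, 1])
    # return (centroid_x, centroid_y, count) per cluster
    return [(c[0] / c[2], c[1] / c[2], c[2]) for c in clusters]
```

With the result set already capped at 1000 markers, this O(n * k) pass over clusters is cheap compared to the database query.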

For the actual clustering algorithm, I am using Django and Python and the source can be found here.

Tristen