Hi, I'm trying to compute clusters on a set of points in Python, using GeoDjango.

The problem: given a set of points, output a set of clusters of those points. (I'm fine with specifying the number of clusters, the cluster size, or the distance in advance to simplify things.)

There are a few clustering solutions on the web, so it's a well-known problem. I thought GeoDjango would handle this type of problem out of the box, but it's not clear how. I've searched the GeoDjango documentation, Google, and a few other places, but couldn't find anything.

Before I roll my own clustering solution, I thought I'd ask to see if there's a straightforward way to do this using GEOS or another package within GeoDjango.

+1  A: 

GeoDjango does not have any built-in clustering support. As far as I'm aware, this operation is not typically provided by any of the existing open source GIS applications that you would use with GeoDjango.

Several sites running Django/GeoDjango (like everyblock.com) have published their clustering methods, but this support is not built into GeoDjango.

In general, the functionality these applications provide is based on what the underlying database supports. GEOS, the library underneath PostGIS and the general 'state of the art' (at least in the non-Java world), does not have any kind of clustering API or behavior.

Christopher Schmidt
Thanks for the info. :)
vaughnkoch
A: 

As Christopher Schmidt mentioned, there doesn't seem to be any out-of-the-box support for clustering in GeoDjango. However, in case someone else runs into this issue, here's what I did:

  • Installed mlpy and numpy
  • Used the HCluster hierarchical clustering algorithm
  • Wrote a wrapper function to convert the GEOS Point objects into a matrix that mlpy could understand

Documentation at: https://mlpy.fbk.eu/data/doc/clustering.html
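
For anyone who wants to see the shape of the steps above without installing anything, here is a rough sketch in plain Python. It is not the mlpy HCluster code: it swaps in a simple single-linkage grouping cut at a fixed distance, which is one form of hierarchical clustering. The names `points_to_matrix`, `single_linkage`, and the `Point` stand-in are made up for illustration; the wrapper only assumes each point exposes `.x` and `.y`, as GEOS Point objects do. Distances are plain planar Euclidean, so for lon/lat data this is only an approximation.

```python
from collections import namedtuple
from math import hypot

# Hypothetical stand-in for a GEOS Point; only .x and .y are used.
Point = namedtuple("Point", ["x", "y"])


def points_to_matrix(points):
    """Convert point objects with .x/.y attributes into a list of [x, y] rows."""
    return [[float(p.x), float(p.y)] for p in points]


def single_linkage(rows, max_distance):
    """Group row indices so that rows within max_distance of each other,
    directly or through a chain of neighbours, end up in the same cluster."""
    # Each row starts in its own cluster; parent[] is a union-find forest.
    parent = list(range(len(rows)))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every pair of rows that are close enough.
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            dx = rows[i][0] - rows[j][0]
            dy = rows[i][1] - rows[j][1]
            if hypot(dx, dy) <= max_distance:
                parent[find(j)] = find(i)

    # Collect row indices by cluster root.
    clusters = {}
    for i in range(len(rows)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


points = [Point(0, 0), Point(0, 1), Point(10, 10), Point(10, 11), Point(50, 50)]
clusters = single_linkage(points_to_matrix(points), max_distance=2.0)
```

Each cluster is a list of indices into the original point list, so the matching GEOS objects can be looked up afterwards. The pairwise loop is quadratic in the number of points, which is fine for small sets but is a reason to reach for a real library on larger ones.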

vaughnkoch