SpatialKey generates some really nice looking heatmaps, and we're looking into what's involved in doing this for an internal project to visualize large amounts of points. I'm looking for feedback on some ideas on where to get started (and it's just a really interesting problem).

[Image: SpatialKey heatmap]

We know that they're using Flash, and from what we can tell, the heatmaps are interactive rather than being rendered from a tile server. Our first guess at how this is implemented is that the server provides their Flash client with a grid - each cell having a count computed by the server. The Flash client then does some interpolation based on the cell values in the grid to make the pretty output you see above.

At this stage, I'm just interested in how they could possibly generate the grid efficiently server-side (if our assumption on their implementation is even correct). It seems that it would involve:

  1. Performing a query for what's currently in map bounds
  2. Performing an aggregation subquery for each cell within those bounds (doing a count, sum, or average as in the example above).

Throw in doing this at multiple zoom levels at a sane grid resolution, and it seems like you'd need a custom spatial index to make this efficient.

Any takers on explaining an alternative route? If it matters, we're accustomed here to storing our data in PostgreSQL with PostGIS for the spatial index, but I'm open to trying anything.

+3  A: 

As just a guess, I would imagine they have implemented a GIS library in Flash on the client side and are using this to project latitude and longitude coordinates into pixel space. Then they aggregate by pixel to determine the "height" of each pixel and render each point as a circle with a transparent gradient fill, with the start and end colors of the gradient determined by the height of the pixel. Multiple circles overlaid on top of each other will create brighter pixels.

An alternative might be to do this in a greyscale, then map the brightness value to a color scale. That might be most efficient.
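
As a rough illustration of the greyscale idea, here is a minimal Python sketch (standard library only). It is not how SpatialKey does it. The function names `density_grid` and `colorize`, the linear falloff, and the blue-to-red ramp are all assumptions made for the example; points are assumed to be already projected to integer pixel coordinates.

```python
import math

def density_grid(points, width, height, radius):
    """Accumulate a greyscale intensity per pixel from (x, y) pixel points."""
    grid = [[0.0] * width for _ in range(height)]
    for px, py in points:
        # Only visit pixels inside the bounding box of the circle.
        for y in range(max(0, py - radius), min(height, py + radius + 1)):
            for x in range(max(0, px - radius), min(width, px + radius + 1)):
                dist = math.hypot(x - px, y - py)
                if dist <= radius:
                    # Linear falloff: 1.0 at the centre, 0.0 at the edge.
                    grid[y][x] += 1.0 - dist / radius
    return grid

def colorize(grid):
    """Map each intensity to an (r, g, b, a) tuple on a blue-to-red ramp."""
    peak = max(max(row) for row in grid) or 1.0  # avoid dividing by zero

    def ramp(t):
        # Cold pixels are blue and transparent, hot pixels red and opaque.
        return (int(255 * t), 0, int(255 * (1 - t)), int(255 * t))

    return [[ramp(value / peak) for value in row] for row in grid]

grid = density_grid([(2, 2), (3, 2)], width=6, height=5, radius=2)
image = colorize(grid)
```

Overlapping circles simply add up in the greyscale grid, so the color lookup happens once per pixel rather than once per point.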

We sell the more traditional treemap heat maps for integration use in visual analytics applications (eg: heat map SDK), and now have geographic heat maps that colorize areas. We read standard ESRI Shapefile maps and do all the projection and rendering on the client-side (in Java, not Flash, but same concept). I think SpatialKey is doing the same, since they support area-filled rendering, which can't really be done if you are using a tile server like Google Maps.

We're not yet doing density heat maps like this, but have run a couple tests using static images as background. If you want more information, let me know and I can ask my developer how we did it. I know we're currently in development on more point-based features, though I don't know where density heat maps are on the schedule yet.

SpatialKey actually just wrote a good post on the difference between area-filled heat maps (ie: thematic maps) and density heat maps. You can check it out at http://blog.spatialkey.com/2010/02/comparing-thematic-maps-with-density-heatmaps/.

If you do figure out a good way of doing density heat maps, I'd be interested in learning how you did it, as it would be a valuable addition to our visual analytics SDK. Best of luck.

Trevor Lohrbeer
I just realized I may have misinterpreted the question. The question appears directed at how to get the data set which contains the latitude, longitude and "height", rather than how to render this.
Again, not knowing how SpatialKey is doing this, I think you've got it at least partially right. Rather than executing subqueries for each cell, which could quickly overwhelm the database (a 10x10 grid would require 100 subqueries), you could do the following:
- Make the client side pass the width and height of the render surface along with the bounds in longitude and latitude
Trevor Lohrbeer
- Calculate the resolution of the longitude and latitude by doing a range mapping between longitude and latitude and width and height. This tells you the effective bin width and height for each cell from a latitude and longitude perspective
- Query for all the points in the longitude and latitude bounds
- Iterate through each point and round to the nearest longitude and latitude bin
- Store the result in a hashtable lookup with the key being the binned longitude and latitude and the value being the count
Trevor Lohrbeer
- Output the result as a data set with three columns: longitude, latitude and count (eg: height)
The client can then easily render this data set using a GIS library on the front-end. Or you could pre-project the points and send them to the front-end using X,Y pixel coordinates.
[Note: I just realized my use of the term "height" here may be confusing. This is because a density map is essentially a colored topographic map with the color representing the height of each point.]
Trevor Lohrbeer
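
The binning steps in the comments above could be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not a known implementation: `bin_points` and its `cell_px` parameter (pixels per grid cell) are hypothetical names, points are plain `(lon, lat)` tuples already fetched by the bounds query, and each point is floored into a cell and reported at the cell centre rather than rounded to the nearest bin edge.

```python
from collections import Counter

def bin_points(points, bounds, width, height, cell_px=1):
    """Count points per grid cell.

    points: iterable of (lon, lat)
    bounds: (min_lon, min_lat, max_lon, max_lat) of the visible map
    width, height: size of the render surface in pixels
    cell_px: hypothetical knob, how many pixels one grid cell covers
    """
    min_lon, min_lat, max_lon, max_lat = bounds
    cols = max(1, width // cell_px)
    rows = max(1, height // cell_px)
    # Effective bin size in degrees (the "range mapping" step).
    bin_w = (max_lon - min_lon) / cols
    bin_h = (max_lat - min_lat) / rows

    counts = Counter()  # hashtable keyed by (col, row)
    for lon, lat in points:
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue  # outside the map bounds
        # Clamp so points exactly on the upper edge land in the last cell.
        col = min(int((lon - min_lon) / bin_w), cols - 1)
        row = min(int((lat - min_lat) / bin_h), rows - 1)
        counts[(col, row)] += 1

    # Three columns: longitude, latitude (cell centres) and count.
    return [
        (min_lon + (col + 0.5) * bin_w, min_lat + (row + 0.5) * bin_h, n)
        for (col, row), n in sorted(counts.items())
    ]

cells = bin_points(
    [(1, 1), (2, 3), (7, 8), (10, 10), (11, 1)],
    bounds=(0, 0, 10, 10), width=10, height=10, cell_px=5,
)
```

This makes one pass over the points instead of one query per cell, so the cost grows with the number of points in view rather than with the grid resolution.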
Final note: If you calculate your bin width and height to be a power of 10, you should be able to get Postgres to round the latitude and longitude to the right bin boundaries and then group by latitude and longitude so you get the proper data set result from a single query. This would be most efficient.
Trevor Lohrbeer
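
A minimal sketch of the single-query idea, assuming a bin size of 0.1 degrees (one decimal place). To keep it self-contained it runs against an in-memory SQLite database through Python's `sqlite3` module rather than Postgres; the `points` table, its `lat`/`lng` columns, and the sample rows are made up for illustration. In Postgres the same `GROUP BY` shape should apply, but the two-argument `round` only accepts `numeric`, so a `double precision` column would need a cast such as `round(lat::numeric, 1)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (lat REAL, lng REAL)")  # hypothetical schema
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [
        (40.71, -74.01),
        (40.72, -74.02),
        (40.74, -73.99),
        (40.88, -73.91),
        (41.30, -72.90),  # outside the bounds queried below
    ],
)

# Round each coordinate to the bin size, then let the database aggregate.
QUERY = """
SELECT ROUND(lat, 1) AS lat_bin,
       ROUND(lng, 1) AS lng_bin,
       COUNT(*)      AS n
FROM points
WHERE lat BETWEEN ? AND ?
  AND lng BETWEEN ? AND ?
GROUP BY lat_bin, lng_bin
ORDER BY lat_bin, lng_bin
"""

# Bounds are passed as (min_lat, max_lat, min_lng, max_lng).
rows = conn.execute(QUERY, (40.0, 41.0, -75.0, -73.0)).fetchall()
for lat_bin, lng_bin, n in rows:
    print(lat_bin, lng_bin, n)
```

Swapping `COUNT(*)` for `SUM` or `AVG` over a value column would give the other aggregations mentioned in the question.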
Your last comment finally hit me like a revelation - grouping by rounded lat/lng is brilliant in its elegance. That would make the aggregation complete cake for postgres. Thanks!
Brent Dillingham