views:

394

answers:

6

I have a database of addresses, all geocoded.

What is the best way to find all addresses in our database within a certain radius of a given lat, lng?

In other words a user enters (lat, lng) of a location and we return all records from our database that are within 10, 20, 50 ... etc. miles of the given location.

It doesn't have to be very precise.

I'm using MySQL DB as the back end.

+2  A: 

This is a typical spatial search problem.

1> what db are you using, sql2008, oracle, ESRI geodatabase, and postgis are some spatial db engine which has this functionaliyt. 2> Otherwise, you probably look for some spatial Algo library if you want to achieve this. You could code for yourself, but I won't suggest because computation geometry is a complicated issue.

J.W.
+1  A: 

I don't remember the equation off the top of my head, but the Haversine formula is what is used to calculate distances between two points on the Earth. You may Google the equation and see if that gives you any ideas. Sorry, I know this isn't much help, but maybe it will give a place to start.

dsrekab
+2  A: 

If you're using a database which supports spatial types, you can build the query directly, and the database will handle it. PostgreSQL, Oracle, and the latest MS SQL all support this, as do some others.

If not, and precision isn't an issue, you can do a search in a box instead of by radius, as this will be very fast. Otherwise, things get complicated, as the actual conversion from lat-long -> distances needs to happen in a projected space (since the distances change in different areas of the planet), and life gets quite a bit nastier.

Reed Copsey
+3  A: 

You didn't mention your database but in SQL Server 2008 it is as easy as this when you use the geography data types

This will find all zipcodes within 20 miles from zipcode 10028

SELECT h.*
FROM zipcodes g
JOIN zipcodes h ON g.zipcode <> h.zipcode
AND g.zipcode = '10028'
AND h.zipcode <> '10028'
WHERE g.GeogCol1.STDistance(h.GeogCol1)<=(20 * 1609.344)

see also here SQL Server 2008 Proximity Search With The Geography Data Type

The SQL Server 2000 version is here: SQL Server Zipcode Latitude/Longitude proximity distance search

SQLMenace
Unfortunately, I'm using mysql db. =(
Nick
vote down: Database is MySql, I'd go with the MySql Geospatial extensions.Or do it in code with a GIS framework (dependant on the language: Topology Framework .NET or JTS Topology Suite for Java)
Bertvan
@Bertvan, OP didn't include database, but has since added that info
KM
+5  A: 

There are Spatial extensions available for MySQL 5 - an entry page to the documentation is here:

http://dev.mysql.com/doc/refman/5.0/en/spatial-extensions.html

There are lots of details of how to accomplish what you are asking, depending upon how your spatial data is represented in the DB.

Another option is to make a function for calculating the distance using the Haversine formula mentioned already. The math behind it can be found here:

www.movable-type.co.uk/scripts/latlong.html

Hopefully this helps.

I'd go with this option. Or do it in code with a GIS framework (dependant on the language: Topology Framework .NET or JTS Topology Suite for Java)
Bertvan
A: 

If it doesn't have to be very accurate, and I assume you have an x and y column in your table, then just select all rows in a big bounding rectangle, and use pythagorus (or Haversine) to trim off the results in the corners.

eg. select * from locations where (x between xpos-10 miles and xpos+10miles) and (y between xpos -10miles and ypos+10miles).

Remember pythagorus is sqrt(x_dist^2 + y_dist^2).

Its quick and simple, easy to understand and doesn't need funny joins.

gbjbaanb
There's no efficient way to do a bounding box query on separate x and y indexes. Any query plan will boil down to selecting on one, then filtering on the other, which will lead to horrible scaling problems (like selecting the entire west coast of the US just to get data for San Francisco).
Nick Johnson