The standard way to calculate the distance between two latitude/longitude coordinates is the haversine formula. It's more involved than a plain distance formula because it takes the curvature of the earth into account, which matters unless your distances are very short.
Here's an article about using PHP and MySQL to implement a locator app: Creating a Store Locator with PHP, MySQL & Google Maps
You can also find many other questions here on Stack Overflow regarding calculating distance between coordinates: http://stackoverflow.com/search?q=longitude+distance
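For reference, here is a minimal PHP sketch of the haversine formula mentioned above. The function name haversineKm() and the constant EARTH_RADIUS_KM are hypothetical names chosen for illustration, and 6371 km is the commonly used mean earth radius, so treat the result as an approximation rather than a survey-grade figure.

```php
<?php
// Mean earth radius in kilometers (a commonly used approximation).
const EARTH_RADIUS_KM = 6371.0;

/**
 * Great-circle distance between two points given in decimal degrees.
 * haversineKm() is a hypothetical helper name, not part of any library.
 */
function haversineKm(float $lat1, float $lon1, float $lat2, float $lon2): float
{
    $phi1 = deg2rad($lat1);
    $phi2 = deg2rad($lat2);
    $dPhi = deg2rad($lat2 - $lat1);
    $dLambda = deg2rad($lon2 - $lon1);

    // Haversine term: combines the latitude and longitude differences.
    $a = sin($dPhi / 2) ** 2
       + cos($phi1) * cos($phi2) * sin($dLambda / 2) ** 2;

    // Clamp to 1.0 to guard against floating-point rounding before asin().
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt($a)));
}
```

As a sanity check, haversineKm(0, 0, 0, 1) should come out at roughly 111 km, the length of one degree of longitude at the equator.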
If you only need to calculate distances within 5 km, the curvature of the earth is probably not significant. You can use a plain (Pythagorean) distance formula on the coordinates. You can even skip the square-root calculation if you only use the value to sort locations by which one is closer.
SELECT s.location_id,
(s.lat-p.lat)*(s.lat-p.lat) + (s.long-p.long)*(s.long-p.long) AS distance_squared
FROM Locations s, Locations p
WHERE p.location_id = ?
ORDER BY distance_squared;
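One caveat with the plain formula: it treats a degree of longitude as the same length as a degree of latitude, which is roughly true near the equator but less so at higher latitudes. A possible refinement, sketched here in PHP under that assumption, is to scale the longitude difference by the cosine of the latitude. The function name flatDistanceSquared() is hypothetical, and the result is in squared latitude-degrees, so it is only useful for ranking nearby points.

```php
<?php
/**
 * Squared flat-earth distance in "latitude-degree" units, suitable only for
 * ranking nearby points. flatDistanceSquared() is a hypothetical helper name.
 */
function flatDistanceSquared(float $lat1, float $lon1, float $lat2, float $lon2): float
{
    // A degree of longitude shrinks by cos(latitude); use the mean latitude.
    $scale = cos(deg2rad(($lat1 + $lat2) / 2));
    $dLat = $lat2 - $lat1;
    $dLon = ($lon2 - $lon1) * $scale;

    return $dLat * $dLat + $dLon * $dLon;
}
```

In the SQL query, the same idea would mean multiplying the longitude difference by COS(RADIANS(p.lat)) before squaring it.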
You wrote: "The table ('locations') I am selecting from has lat/long coordinates (let's call them $plat and $plong)."
No, those are PHP variables. You need to calculate the distance from the lat/long coordinates on each row of your database table.
The problem is that SQL normally only calculates things from one row at a time. So you need to have some way of combining two rows into one row, so a calculation can use the coordinates of two locations. You do this with a self-join. This is basically pairing each row with another row from the same table. And that's why I list Locations twice, and give them two different aliases, s and p (the technical term is correlation name).
If you're accustomed to PHP, think of this self-join as analogous to a nested loop:
foreach ($locations as $s) {
    foreach ($locations as $p) {
        // calculate the distance between $s and $p
    }
}
The WHERE clause restricts the rows of p to just the place you start from (you would substitute a single value for the ? placeholder), so it's just one row that the query pairs with all the rows of the table.
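To show how the ? placeholder gets its value, here is a hedged sketch of running the query from PHP with a PDO prepared statement. It assumes an in-memory SQLite database as a stand-in for MySQL so that it runs without a server, and the table rows are made up for illustration.

```php
<?php
// Assumption: an in-memory SQLite database stands in for MySQL here so the
// sketch is self-contained; with MySQL you would use a mysql: DSN instead.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Illustrative table and rows (invented for this example).
$pdo->exec('CREATE TABLE Locations (location_id INTEGER PRIMARY KEY, lat REAL, long REAL)');
$pdo->exec('INSERT INTO Locations VALUES (1, 10.00, 20.00), (2, 10.01, 20.01), (3, 10.05, 20.00)');

$sql = 'SELECT s.location_id,
               (s.lat-p.lat)*(s.lat-p.lat) + (s.long-p.long)*(s.long-p.long) AS distance_squared
        FROM Locations s, Locations p
        WHERE p.location_id = ?
        ORDER BY distance_squared';

$stmt = $pdo->prepare($sql);
$stmt->execute([1]);   // 1 replaces the ? placeholder: the starting location
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    echo $row['location_id'], ' ', $row['distance_squared'], PHP_EOL;
}
```

Note that the starting location pairs with itself at distance zero, so it appears first in the result. If you don't want that, one option is to add AND s.location_id <> p.location_id to the WHERE clause.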
Another tip: I skipped using SQRT() because it's not necessary just to sort by distance. That is, if three locations are 10km, 15km, and 20km from me, then I can sort by 100, 225, and 400 and get the same ordering as if I sort by 10, 15, and 20. The advantage is that I've eliminated calculating the square root which reduces the cost of my query somewhat.
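This works because squaring never changes the order of non-negative numbers. A small PHP check of that claim, using the example distances above with made-up location keys:

```php
<?php
// Hypothetical distances in km, keyed by a made-up location name.
$distances = ['a' => 20.0, 'b' => 10.0, 'c' => 15.0];

// Square each distance, as the query does implicitly by omitting SQRT().
$squared = array_map(fn (float $d): float => $d * $d, $distances);

asort($distances);  // sort by true distance, preserving keys
asort($squared);    // sort by squared distance, preserving keys

// Both orderings list the same keys in the same sequence.
var_dump(array_keys($distances) === array_keys($squared));
```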