I have a situation that's similar to what goes on in a job search engine where you type in the zipcode where you're searching for a job and the app returns jobs in that zipcode as well as in zipcodes that are 5, 10, 15, 20 or 25 miles from that zipcode, depending on preferences set by the user.

How would you calculate the neighboring locations for a zipcode?

+2  A: 

You need to get a list of zip codes with associated longitude / latitude coordinates. Google it - there are plenty of providers.

Then take a look at this question for an algorithm to calculate the distance between two coordinates.
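As a rough illustration of that second step, here is a minimal sketch that uses the haversine formula (one common choice for great-circle distance, not necessarily what the linked question recommends). The function names, the `ZIP_COORDS` dictionary and its approximate coordinates are all assumptions made up for this example; in practice the coordinates would come from whichever provider you pick.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; treats the Earth as a sphere


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def zips_within(center_zip, radius_miles, zip_coords):
    """Return ZIP codes within radius_miles of center_zip, nearest first."""
    lat0, lon0 = zip_coords[center_zip]
    hits = []
    for code, (lat, lon) in zip_coords.items():
        if code == center_zip:
            continue  # skip the search origin itself
        distance = haversine_miles(lat0, lon0, lat, lon)
        if distance <= radius_miles:
            hits.append((distance, code))
    return [code for _, code in sorted(hits)]


# Hypothetical sample data: approximate centroids, for illustration only.
ZIP_COORDS = {
    "10001": (40.7506, -73.9972),
    "11201": (40.6943, -73.9918),
    "07030": (40.7440, -74.0324),
    "19104": (39.9597, -75.1995),
}

nearby = zips_within("10001", 5, ZIP_COORDS)
```

This scans every entry once per search, which is linear in the number of zipcodes; for 5 to 25 mile radii that is usually cheap enough, and a database index on latitude and longitude can narrow the candidates first.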

ChssPly76
A: 

I wouldn't calculate it; I would store it as a fixed table in the database (changing only when the allocation of ZIP codes changes in a country). Make a relationship "is_neighbor_zip", which holds pairs (smaller, larger). To determine whether two codes are neighboring, check the table for that specific pair. If you want all neighboring zips, it might be better to make the table symmetric.
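A minimal sketch of such a table, using Python's built-in sqlite3 module purely for illustration. The column names, helper functions and the sample adjacency data are assumptions, not part of any real ZIP dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE is_neighbor_zip (
        smaller TEXT NOT NULL,
        larger  TEXT NOT NULL,
        PRIMARY KEY (smaller, larger),
        CHECK (smaller < larger)
    )
""")


def add_neighbors(conn, zip_a, zip_b):
    """Store an unordered pair exactly once, as (smaller, larger)."""
    smaller, larger = sorted((zip_a, zip_b))
    conn.execute(
        "INSERT OR IGNORE INTO is_neighbor_zip VALUES (?, ?)",
        (smaller, larger),
    )


def are_neighbors(conn, zip_a, zip_b):
    """Check whether the specific pair is present in the table."""
    smaller, larger = sorted((zip_a, zip_b))
    row = conn.execute(
        "SELECT 1 FROM is_neighbor_zip WHERE smaller = ? AND larger = ?",
        (smaller, larger),
    ).fetchone()
    return row is not None


def neighbors_of(conn, zip_code):
    """All neighbors of one code; both columns must be searched
    because the table is not symmetric."""
    rows = conn.execute(
        "SELECT larger FROM is_neighbor_zip WHERE smaller = ? "
        "UNION "
        "SELECT smaller FROM is_neighbor_zip WHERE larger = ?",
        (zip_code, zip_code),
    ).fetchall()
    return sorted(row[0] for row in rows)


# Hypothetical adjacency data, for illustration only.
add_neighbors(conn, "10002", "10001")
add_neighbors(conn, "10001", "10003")
```

The `UNION` in `neighbors_of` is the price of storing each pair once; a symmetric table doubles the rows but lets that lookup hit a single indexed column.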

Martin v. Löwis
Let's say there are 50K zipcodes right now (there are more). So you're looking at 50,000! rows in your join table, which is somewhere around 10 to the power of 200,000: http://en.wikipedia.org/wiki/Factorial Are you absolutely sure that's a good approach?
ChssPly76
Why would there be 50000! rows (assuming you denote factorial with !)? You only list a pair of zip codes if they are neighboring. If your country has 50000 ZIP codes (mine has only 17000), and assuming each area has 10 neighbors on average, the database would only have 500,000 entries. Assuming you can fit a zip code in four bytes, you need 2 MB of storage for the raw data.
Martin v. Löwis
"ZIP" refers to US postal codes, and there are more than 50,000 but that's not too important. When you say "neighboring", what exactly do you mean? OP wanted to find all postal codes within certain distance of current one. Depending on that distance, you can very much have postal codes that are not immediate neighbors. Your solution may work for small distances, but I would question its efficiency for distances above 100 miles.
ChssPly76
See the title of the OP's question. He is asking for neighboring zip codes. If the list of neighboring zipcodes can become long (determined by whatever metrics), the results of the job search also become long, so it is likely that the application would only be interested in "true" neighboring areas (i.e. a small fraction, or even a constant number, of the total list of candidates). Even if you look for "unrestricted" neighborhood, there wouldn't be even close to 50000! entries - at worst 50000^2.
Martin v. Löwis
The OP's question title is admittedly unclear; however, he specifically says in the question that he's looking for postal codes "that are 5, 10, 15, 20 or 25 miles from that zipcode". The 50,000 factorial figure is wrong; I misunderstood your suggestion. However, I still maintain that your approach will not work well for larger distances. For the record, I'm not the one who down-voted you; however, as it's possible that the downvote was caused by my mistaken estimation, I'm up-voting to compensate.
ChssPly76
+1  A: 

I don't know if you can count on geonames.org to be around for the life of your app but you could use a web service like theirs to avoid reinventing the wheel.

http://www.geonames.org/export/web-services.html

flo
A: 

You need to use a GIS database and ask it for ZIP codes that are nearby your current location.

You cannot simply take the ZIP code number and apply some mathematical calculations to find other nearby ZIP codes. ZIP codes are not as geographically scattered as area codes in the US, but they are not a coordinate system.

The only exception is that the ZIP+4 codes are sub-sections of the larger ZIP code. You can assume that any ZIP+4 codes that have the same ZIP code are close to each other.

I used to work on rationalizing the ZIP code handling at a company; here are some practical notes I made:

Testing ZIP codes

Hopefully has other useful info.

benc
A: 

Whenever you create a zipcode, geocode it (e.g. with the Google geocoder API, saving the latitude and longitude), then google the haversine formula; this will calculate the distance (as the crow flies) from a reference point, which could also be geocoded if it is a town or zipcode.

To clarify some more:

When you are retrieving records based on their location, you need to compare each longitude and latitude DECIMAL with a reference point (your user's geocoded postcode or town name).

You can query:

SELECT * FROM photos p WHERE p.lat < 60 AND p.lat > 50 AND p.long > -10 AND p.long < 10

to find all UK photos etc., because the UK lies between roughly 50 and 60 degrees latitude and within ±10 degrees of longitude.
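The same box idea can be adapted to a radius search around a zipcode. The sketch below is only an approximation under stated assumptions: it uses a flat 69 miles per degree of latitude, ignores the poles and the 180-degree meridian, and assumes a hypothetical `zipcodes` table with `lat` and `lon` columns. The box only roughly encloses the circle, so it is best treated as a coarse filter that is followed by an exact distance check.

```python
import math

MILES_PER_DEGREE_LAT = 69.0  # rough average; an assumption for this sketch


def bounding_box(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) around a point."""
    dlat = radius_miles / MILES_PER_DEGREE_LAT
    # A degree of longitude covers fewer miles as latitude increases,
    # so the longitude span has to widen by 1 / cos(latitude).
    dlon = radius_miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def box_query(lat, lon, radius_miles):
    """Build a parameterised query and its arguments for the box.

    The table and column names are hypothetical.
    """
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_miles)
    sql = ("SELECT * FROM zipcodes "
           "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?")
    return sql, (min_lat, max_lat, min_lon, max_lon)


sql, args = box_query(40.0, -75.0, 25)
```

Because the comparison is a plain range test on two columns, an ordinary index on them lets the database discard most rows before any trigonometry is done.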

If you want to find the distance then you will need to google the haversine formula and plug in your reference values.

Hope this clears things up a little bit more; leave a comment if you need details.

Question Mark
A: 

I'm confused about this as well.

If you have 50000 zipcodes, you need to compare each zipcode against every other zipcode. Even with caching the results to avoid recalculating, this is an enormous number of calculations.

Unless I'm missing something, that is what would be required.

Am I wrong?