I'm new to SQL and relational databases and I have what I would imagine is a common problem.

I'm making a website, and when a user submits a post they have to provide a location as either a zip code or a city/state.

What is the best practice for handling this? Do I simply create Zip Code, City, and State tables and query against them, or are there ready-made solutions for handling this?

I'm using SQL Server 2005 if it makes a difference.

I need to be able to retrieve a zip code given a city/state, and to return the city/state given a zip code.

+7  A: 

You have a couple of options. You can buy a bulk zip-code library from somebody, which will list zip codes, cities, counties, etc. by state, or you can pay for access to a web service that performs the same function on a more granular level.

Your best bet would be to go with the zip-code library option, as it'll cost you less than the web service and will provide better performance. How you query or pre-process this library is up to you. You mention SQL Server, so you'd probably want State, Zipcode, and City tables, and include the relevant relationships between them. You'll also need to have provisions for cities that span multiple zipcodes, or for zipcodes that have multiple cities - but none of these issues are insurmountable.
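A minimal sketch of the table layout described above, using Python's built-in sqlite3 as a stand-in for SQL Server. The table and column names (State, City, ZipCode, CityZip) and the helper functions are hypothetical, and the rows are illustrative only (the 10027 pair comes from the comments below). The junction table is what handles cities that span multiple zip codes and zip codes that cover multiple cities.

```python
import sqlite3

# sqlite3 stands in for SQL Server; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE State   (state_code TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE City    (city_id INTEGER PRIMARY KEY,
                      name TEXT NOT NULL,
                      state_code TEXT NOT NULL REFERENCES State(state_code));
-- Zip codes are stored as text so leading zeros are preserved.
CREATE TABLE ZipCode (zip TEXT PRIMARY KEY);
-- Junction table: many-to-many between cities and zip codes.
CREATE TABLE CityZip (city_id INTEGER NOT NULL REFERENCES City(city_id),
                      zip TEXT NOT NULL REFERENCES ZipCode(zip),
                      PRIMARY KEY (city_id, zip));
""")

# Illustrative rows only, not real reference data.
conn.execute("INSERT INTO State VALUES ('NY', 'New York')")
conn.executemany("INSERT INTO City VALUES (?, ?, ?)",
                 [(1, "New York City", "NY"), (2, "Manhattan", "NY")])
conn.executemany("INSERT INTO ZipCode VALUES (?)", [("10001",), ("10027",)])
conn.executemany("INSERT INTO CityZip VALUES (?, ?)",
                 [(1, "10001"), (1, "10027"), (2, "10027")])


def cities_for_zip(zip_code):
    """Return every (city, state) pair linked to a zip code."""
    return conn.execute(
        """SELECT c.name, c.state_code
             FROM City c
             JOIN CityZip cz ON cz.city_id = c.city_id
            WHERE cz.zip = ?
            ORDER BY c.name""",
        (zip_code,),
    ).fetchall()


def zips_for_city(city, state_code):
    """Return every zip code linked to a city/state pair."""
    rows = conn.execute(
        """SELECT cz.zip
             FROM CityZip cz
             JOIN City c ON c.city_id = cz.city_id
            WHERE c.name = ? AND c.state_code = ?
            ORDER BY cz.zip""",
        (city, state_code),
    ).fetchall()
    return [r[0] for r in rows]
```

Both lookups return lists rather than a single value, which keeps the ambiguous cases visible to the calling code instead of silently picking one.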

As far as dealing with the vagaries of user input, you may consider enlisting the help of an address validation web service, although most of them require a full shipping address in order to validate.

Edit: looks like there's a SourceForge project offering free zip-code data, including lat/lon data, etc. Not sure how correct or current it is.

Edit 2: After a cursory look at that SourceForge project's site, it appears to be a dead project. If you use this data, you'll need to provide some allowance for zipcodes / cities that don't exist in your database. Purchased bulk libraries usually come with some sort of guarantee of updates, or a pricing plan for updates, etc., and are probably more reliable.

Erik Forbes
It won't work this way, unfortunately.
alphadogg
Mind elaborating?
Erik Forbes
I didn't vote down, but see my own answer to this question. Unless the OP involves the user, there can be issues figuring out city from zip or vice versa. Otherwise, you basically said what I said...
alphadogg
I'd say in the cases where there's ambiguity, there could be a page asking for more details - via your example: "Your zip code is 10027 - did you mean New York City or Manhattan?" with a radio button next to each. In the common case, people won't see this page; I would think that a secondary page...
Erik Forbes
...in the exceptional case wouldn't be too great a burden for the users who encounter it.
Erik Forbes
I just recently finished building a system that relies on user-entered address data, so I know it can be done, even in the ambiguous cases - although in my case (weather API interop) it's not so important to distinguish between the NY boroughs, for example.
Erik Forbes
So I suppose a good counter-question would be - how important is it that you have perfectly accurate data?
Erik Forbes
Go through the checkout process at Apple.com -- they only ask for zip code data, and if there are ambiguities, they ask you to clarify. Good example with (I assume) lots of usability testing behind it.
Alex Czarto
A: 

My understanding is that the USPS web API is free to use, but requires permission that depends on a number of factors, including the nature of the program that will be using it and the reason you need the data.

If you qualify to use it, this would presumably be the most accurate source for the information you need.

Ben Dunlap
When signing up for the USPS API, be sure to carefully consider how you answer the application questions. They ask leading questions that, if answered the wrong way, will get your request denied. They also had serious downtime issues during the holidays this past year.
DavGarcia
+2  A: 

Have a ZipCode table that is related to a CityState table. Some zip codes have multiple cities associated with them, so you may need to have the interface let the user select the city they want or let them override the default.
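A rough sketch of that default-plus-override idea, again with sqlite3 standing in for SQL Server. The is_default column, the city_choices helper, and the choice of which city is marked as the default are all assumptions made for illustration; the rows are sample data, not a real zip database.

```python
import sqlite3

# sqlite3 stands in for SQL Server; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CityState (city_state_id INTEGER PRIMARY KEY,
                        city TEXT NOT NULL,
                        state TEXT NOT NULL);
CREATE TABLE ZipCode   (zip TEXT NOT NULL,
                        city_state_id INTEGER NOT NULL
                            REFERENCES CityState(city_state_id),
                        is_default INTEGER NOT NULL DEFAULT 0,
                        PRIMARY KEY (zip, city_state_id));
""")
conn.executemany("INSERT INTO CityState VALUES (?, ?, ?)",
                 [(1, "New York City", "NY"),
                  (2, "Manhattan", "NY"),
                  (3, "Beverly Hills", "CA")])
conn.executemany("INSERT INTO ZipCode VALUES (?, ?, ?)",
                 [("10027", 1, 1), ("10027", 2, 0), ("90210", 3, 1)])


def city_choices(zip_code):
    """Return (default, alternatives) for a zip code.

    default is None when the zip code is not in the table; alternatives
    is the list the interface would offer as overrides.
    """
    rows = conn.execute(
        """SELECT cs.city, cs.state
             FROM ZipCode z
             JOIN CityState cs ON cs.city_state_id = z.city_state_id
            WHERE z.zip = ?
            ORDER BY z.is_default DESC, cs.city""",
        (zip_code,),
    ).fetchall()
    if not rows:
        return None, []
    return rows[0], rows[1:]
```

When alternatives is empty the interface can fill in the city silently; only the ambiguous zip codes need the extra selection step.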

I use the paid service from ZipInfo.com since I needed additional information such as lat/long, zip type and county. Zip codes also change several times a year as new zip codes are added or merged with others, so you will need to update your data a few times a year to stay consistent.

DavGarcia
+1  A: 

Depending on the details of what you're building, Yelp has a free neighborhoods API that may be able to get you what you need. Be sure to check their Terms of Use and stuff to make sure you stay in compliance.

I know this isn't a db centric answer, but what you're doing may not be best handled in the database itself.

Bramha Ghosh
A: 

Not sure I understand the question. Do you need to allow either and, later, return both?

You'll have to be careful, even with a zip/city database that can be purchased, since some cities span multiple zips, so you can't always "calculate" in that direction. Similar issue in the opposite direction.

alphadogg
A user can provide either, and I need both.
Simucal
Well, then in many cases you won't be able to calculate the missing info unless you also involve the street address.
alphadogg
For example, New York City has six zip codes. If you take one of them, say 10027, that zip code is in two cities: "New York City" and "Manhattan". Not to mention that your user could supply "New York City" as "NYC", "NY City", "New York", etc...
alphadogg
A: 

There are libraries that you can buy and import that aren't that expensive. The catch is an ongoing (though small) maintenance cost: zip codes change all the time, so within 6 months or so your data will be slightly out of date. You might want to look at interfacing with a service like Google Maps. Here we use a hybrid approach. We pay for updates to the data every 6 months, and if a zip code isn't in the table we verify it against Google Maps to make sure the information entered is valid and simply hasn't been added to our system yet. If it is valid, we update the system; if not, we let the user know that a mistake was made.
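The hybrid approach could look roughly like the sketch below: check the local table first, fall back to an external check on a miss, and store whatever the external check confirms. The verify_remote callable is a hypothetical placeholder for whatever service you use (no real Google Maps API call is shown), and fake_remote exists only so the example runs on its own.

```python
import sqlite3
from typing import Callable, Optional, Tuple

CityState = Tuple[str, str]

# sqlite3 stands in for SQL Server; table name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ZipCode (zip TEXT PRIMARY KEY, "
    "city TEXT NOT NULL, state TEXT NOT NULL)"
)
conn.execute("INSERT INTO ZipCode VALUES ('90210', 'Beverly Hills', 'CA')")


def lookup_zip(
    zip_code: str,
    verify_remote: Callable[[str], Optional[CityState]],
) -> Optional[CityState]:
    """Local table first; on a miss, ask the remote service and cache a hit."""
    row = conn.execute(
        "SELECT city, state FROM ZipCode WHERE zip = ?", (zip_code,)
    ).fetchone()
    if row is not None:
        return row
    found = verify_remote(zip_code)
    if found is None:
        return None  # caller tells the user the zip code looks wrong
    conn.execute("INSERT INTO ZipCode VALUES (?, ?, ?)", (zip_code, *found))
    return found


def fake_remote(zip_code: str) -> Optional[CityState]:
    """Stand-in for a real service call; knows a single zip code."""
    return {"10027": ("New York City", "NY")}.get(zip_code)
```

Passing the remote check in as a function keeps the database logic independent of any particular provider, so the service can be swapped or stubbed out in tests.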

Another option would be to query the USPS website. I believe they have a city/state lookup page by zip code.

Kevin
A: 

I got that beat. Here's a free one (zipped, csv):

Hosted At maphacks.com

Headers are zip,city,state,latitude,longitude,timezone,dst
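For what it's worth, a file with those headers can be loaded with a few lines of Python. This is only a sketch: it assumes the file has a header row and plain comma-separated values, the ZipCode table and load_zip_csv helper are hypothetical, the SAMPLE rows are made-up illustrative values rather than lines from the actual download, and sqlite3 stands in for SQL Server.

```python
import csv
import io
import sqlite3

# Made-up sample rows in the stated column order (not from the real file).
SAMPLE = """zip,city,state,latitude,longitude,timezone,dst
02134,Boston,MA,42.36,-71.13,-5,1
10027,New York,NY,40.81,-73.95,-5,1
90210,Beverly Hills,CA,34.09,-118.41,-8,1
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE ZipCode (
           zip TEXT PRIMARY KEY,   -- text keeps leading zeros such as 02134
           city TEXT NOT NULL,
           state TEXT NOT NULL,
           latitude REAL,
           longitude REAL,
           timezone INTEGER,
           dst INTEGER)"""
)


def load_zip_csv(connection, fileobj):
    """Insert every CSV row into ZipCode and return the number of rows read."""
    reader = csv.DictReader(fileobj)
    rows = [
        (r["zip"], r["city"], r["state"],
         float(r["latitude"]), float(r["longitude"]),
         int(r["timezone"]), int(r["dst"]))
        for r in reader
    ]
    # INSERT OR REPLACE lets a later reload overwrite existing zip codes.
    connection.executemany(
        "INSERT OR REPLACE INTO ZipCode VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    return len(rows)


loaded = load_zip_csv(conn, io.StringIO(SAMPLE))
```

For the real file you would pass an open file handle instead of io.StringIO(SAMPLE); rerunning the load after each new download is one cheap way to pick up changes.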

Now, the potential downside is that you aren't paying for updates (which can bite you in the end, but they try to keep it somewhat updated).

Watch out for free. Sometimes, they disappear. Or, updates aren't clean. I relied on a free one once... once. If your database is important, don't play with that data. GIGO.
alphadogg
A: 

There are many free geo-coding webservices where you can get a zip from a city-state, and vice-versa. Take a look at the GeoNames webservice. You could do something like check your db, and then if what you are looking for is not there, grab it from the webservice and add it.

Muad'Dib
A: 

If the reason for the lookup is the user's convenience, here's an alternate approach that doesn't require licensing any third-party databases:

Just look up the city/state from your existing name/address table, provided the zip code matches. If somebody has previously made an entry for that zip code, then you'll find the city and state from that entry. If no previous entry exists, then worst case the user has to enter the city and state.
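As a sketch of that lookup: the Address table, its columns, and the suggest_city_state helper below are hypothetical, and picking the most frequently entered city/state for a zip code is an added assumption (the simplest version would take any matching row). sqlite3 stands in for SQL Server, where the query would use TOP 1 instead of LIMIT 1.

```python
import sqlite3

# sqlite3 stands in for SQL Server; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Address (id INTEGER PRIMARY KEY, name TEXT, "
    "city TEXT, state TEXT, zip TEXT)"
)
conn.executemany(
    "INSERT INTO Address (name, city, state, zip) VALUES (?, ?, ?, ?)",
    [
        ("User A", "New York", "NY", "10027"),
        ("User B", "New York", "NY", "10027"),
        ("User C", "NYC", "NY", "10027"),
    ],
)


def suggest_city_state(zip_code):
    """Return the most common (city, state) previously entered for a zip.

    Returns None when nobody has entered that zip code before, in which
    case the user types the city and state themselves.
    """
    return conn.execute(
        """SELECT city, state
             FROM Address
            WHERE zip = ?
            GROUP BY city, state
            ORDER BY COUNT(*) DESC, city
            LIMIT 1""",
        (zip_code,),
    ).fetchone()
```

Grouping by the entered spelling means a one-off variant like "NYC" does not override the spelling most users chose.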

This solution assumes your need is for convenience for the user. If you are more concerned about accurate validation of city, state, zip codes then you're better off licensing a verified database.

Kluge
A: 

Free zip code database located at:

http://www.populardata.com

A: 

As previously mentioned, PopularData.com provides a free CSV of U.S. ZIP code data.

I generated a SQL script for populating a SQL Server 2005 table with the PopularData.com ZIP code data.

You can use this table to do ZIP-to-City/State or City/State-to-ZIP lookups.

The usual caveat applies here: you get what you pay for. If you need assurance that the data are up-to-date and valid, go with a commercial solution.

Jon Sagara
A: 

Cheap USPS data is at http://semaphorecorp.com

For the pitfalls of only validating city-ZIP correspondence without considering the address, see http://semaphorecorp.com/cgi/zip5.html

joe snyder