Background
To replace invalid zip codes.
Sample Data
Consider the following data set:
Typo | City | ST | Zip5
-------+------------+----+------
33967 | Fort Myers | FL | 33902
33967 | Fort Myers | FL | 33965
33967 | Fort Myers | FL | 33911
33967 | Fort Myers | FL | 33901
33967 | Fort Myers | FL | 33907
33967 | Fort Myers | FL | 33994
34115 |Marco Island| FL | 34145
34115 |Marco Island| FL | 34146
86405 | Kingman | FL | 86404
86405 | Kingman | FL | 86406
33967 closely matches 33965, although 33907 could also be correct. (In this case, 33967 is a valid zip code, but not in our zip code database.)
34115 closely matches is 34145 (off by one digit, with a difference of 3 for that digit).
86405 closely matches both.
Sometimes digits are simply reversed (e.g,. 89 instead of 98).
Question
How would you write a SQL statement that finds the "minimum distance" between multiple numbers that have the same number of digits, returning at most one result no matter what?
Ideas
- Subtract the digits.
- Use
LIMIT 1
.
Conditions
PostgreSQL 8.3