A web application I am working on has encountered an unexpected 'bug' - The database of the app has two tables (among many others) called 'States' and 'Cities'.
'States' table fields:
-------------------------------------------
idStates | State | Lat | Long
-------------------------------------------
'idStates' is an auto-incrementing primary key.
'Cities' table fields:
----------------------------------------------------------
idAreaCode | idStates | City | Lat | Long
----------------------------------------------------------
'idAreaCode' is a primary key consisting of country code + area code (e.g. 91422 where 91 is the country code for india and 422 is the area code of a city in India). 'idStates' is a foreign key derived from 'States' table to associate each city in the 'Cities' table with its corresponding State.
We figured that the country code + area code combination would be unique for each city, and thus could safely be used as a primary key. Everything was working. But a location in India found an unexpected 'flaw' in the db design - India, like the US is a federal democracy and is geographically divided into many states or union territories. Both the states and union territories data is stored in the 'States' table. There is, however, one location - Chandigarh - which belongs to TWO states (Haryana and Punjab) and is also a union territory by itself.
Obviously, the current db design doesn't allow us to store more than one record of the city 'Chandigarh'.
One of the solutions suggested is to create a primary key combining the columns 'idAreaCode' and 'idStates'.
I'd like to know if this is the best solution possible?
(FYI: we are using MySQL with the InnoDB engine).
More information:
- The database stores meteorological information for each city. Thus, the state and city are the starting point of each query.
- Fresh data for each city is inserted everyday using a CSV file. The CSV file includes an idStates (for state) and idAreaCode (for city) column which is used to identify each record.
- Database normalization is important to us.
Note: The reason for not using an auto incrementing primary key for the city table is that the database is updated everyday / hourly using a CSV file (which is generated by another app). And each record in the CSV file is identified by the idStates and idAreaCode column. Hence it is preferred that the primary key used in the city table is the same for every city, even if the table is deleted and refreshed again. Zip codes (or pin codes) and area codes (or STD codes) meet the criteria of being unique, static (don't change often) and a ready list of these are easily available. (We decided on area codes for now because India is in the process of updating its pin codes to a new format).
The solution we decided on was to handle this at the application level instead of making changes to the database design. In the database we will only be storing one record of 'Chandigarh'. In the application we've created a flag for any search for 'Chandigarh, Punjab' or 'Chandigarh, Haryana' to redirect search to this record. Yeah, it's not ideal, but an acceptable compromise since this is the ONLY exception we've come across so far.