The closest thing there is to an industry standard is this: each dependent table is linked by a foreign key to its immediate parent:
create table country
(country_id number not null
, country_name varchar2(30)
, constraint country_pk primary key (country_id)
)
/
create table state
(state_id number not null
, state_name varchar2(30)
, country_id number not null
, constraint state_pk primary key (state_id)
, constraint state_country_fk foreign key (country_id)
references country(country_id)
)
/
create table city
(city_id number not null
, city_name varchar2(30)
, state_id number not null
, constraint city_pk primary key (city_id)
, constraint city_state_fk foreign key (state_id)
references state(state_id)
)
/
create table neighbourhood
(neighbourhood_id number not null
, neighbourhood_name varchar2(30)
, city_id number not null
, constraint neighbourhood_pk primary key (neighbourhood_id)
, constraint neighbourhood_city_fk foreign key (city_id)
references city(city_id)
)
/
An alternative approach, which has largely fallen out of favour, is to define the primary keys of the child tables as compound keys including the keys of the immediate parent table:
create table state
(country_id number not null
, state_id number not null
, state_name varchar2(30)
, constraint state_pk primary key (country_id, state_id)
, constraint state_country_fk foreign key (country_id)
references country(country_id)
)
/
create table city
(country_id number not null
, state_id number not null
, city_id number not null
, city_name varchar2(30)
, constraint city_pk primary key (country_id, state_id, city_id)
, constraint city_state_fk foreign key (country_id, state_id)
references state(country_id, state_id)
)
/
create table neighbourhood
(country_id number not null
, state_id number not null
, city_id number not null
, neighbourhood_id number not null
, neighbourhood_name varchar2(30)
, constraint neighbourhood_pk primary key (country_id, state_id, city_id, neighbourhood_id)
, constraint neighbourhood_city_fk foreign key (country_id, state_id, city_id)
references city(country_id, state_id, city_id)
)
/
This approach is deprecated because in the short term it creates exceedingly unwieldy joins and in the long term it creates horrible messes when the keys change. Primary keys are not supposed to change, but compounding them creates meaning. Consequently, when the system's data changes - say there's a state boundary re-organisation - changes to a whole bunch of cities have to be cascaded to the Neighbourhood table, and any other children. Yuck.
Your proposal is an alternate version of this:
create table state
(state_id number not null
, state_name varchar2(30)
, country_id number not null
, constraint state_pk primary key (state_id)
, constraint state_country_fk foreign key (country_id)
references country(country_id)
)
/
create table city
(city_id number not null
, city_name varchar2(30)
, country_id number not null
, state_id number not null
, constraint city_pk primary key (city_id)
, constraint city_country_fk foreign key (country_id)
references country(country_id)
, constraint city_state_fk foreign key (state_id)
references state(state_id)
)
/
create table neighbourhood
(neighbourhood_id number not null
, neighbourhood_name varchar2(30)
, country_id number not null
, state_id number not null
, city_id number not null
, constraint neighbourhood_pk primary key (neighbourhood_id)
, constraint neighbourhood_country_fk foreign key (country_id)
references country(country_id)
, constraint neighbourhood_state_fk foreign key (state_id)
references state(state_id)
, constraint neighbourhood_city_fk foreign key (city_id)
references city(city_id)
)
/
It avoids the compound keys but you still have that cascading data problem. Also it violates relational practice by introducing foreign keys for relationships which don't exist (there is no direct relationship between Neighbourhood and Country, it is implied through the intermediate linkages).
On the plus side, as you point out, this can be very helpful for running queries which want to return Neighbourhoods for a given Country. I have worked on one system where this was useful (actually it employed inherited compound keys, but the principle is the same). However, this was a very specialist data warehouse and even then the queries I ran were admin/developer queries rather than application ones. Unless you are dealing with huge amounts of data (millions of neighbourhoods) I think the performance gain from skipping a couple of joins would not be worth the overhead of managing those additional columns.
In short, use the first approach: it's neat and it's standard.
edit
"State should be optional though since
not all countries have a state. Then a
Country will connect with city
directly."
If true, that changes everything. Obviously STATE cannot be used as an identifying foreign key for CITY. So CITY must reference COUNTRY instead. STATE can be an optional lookup on CITY.
Although I think most countries do have some equivalent sub-division such as counties or departments. Even the microstates like Lichtenstein and San Marino have municipalities (Monaco has just the one). Perhaps the only nation which doesn't is Vatican City. So consider carefully whether to structure your data model to support one or two edge cases, or to munge the data by injecting an artificial "State" for exceptions such as the Holy See. Neither approach is entirely satisfactory.
"all these fields will be
auto-complete fields so not sure if
that alters the table structure in
anyway?"
Makes no difference.
"But who knows, few months later we
may discover some cool feature that
may need country to match with
neighbourhoods too."
Yes but then again you may not. XP has a powerful principle called YAGNI - You're aren't gonna need it. Basically, don't do a lot of work and complicate your design for the sake of some putative future requirement which might never arrive.
And if it does arrive then the first solution would be to join NEIGHBOURHOOD and COUNTRY through the intermediate tables (or table, if you don't use STATE as the referenced for CITY). Only if the performance of that query is Teh Suck! and it it stubbornly resists tuning should you consider tweaking the data model.