views:

54

answers:

2

I have a few tables representing geographical entities (CITIES, COUNTIES, ZIPCODES, STATES, COUNTRIES, etc).

I need to way represent sets of geographical entities. A set can contain records from more than one table. For example, one set may contain 3 records from CITIES, 1 record from COUNTIES and 4 from COUNTRIES.

Here are two possible solutions:

  • A table which contains three columns - one record for each entity. The table will contain multiple records for each set, all sharing the the set number.

set_id INT, foreign_table VARTEXT(255), foreign_id INT

Sample entries for set #5:

(5,'CITIES',4)

(5,'CITIES',12)

(5,'ZIPCODES',91)

(5,'ZIPCODES',92)

(5,'COUNTRIES',15)

  • A table which contains a TEXT column for each entity type, which will include a string set with the appropriate entries:

set_id INT,cities TEXT,counties TEXT,zipcodes TEXT,states TEXT,countries TEXT

So the above set will be represented with a single record

(5,'4,12','','91,92','','15')

Any other ideas? Would love to hear your input. Thanks!

+1  A: 

Both solutions you propose don't have real foreign keys. In the first solution, one foreign_id can point to many tables, which is hard (or at least inefficient) for a database to enforce. The second solution stores multiple values in one column, which is the one thing everyone agrees you shouldn't do (it breaks first normal form.)

What I would do is this: cities, zip codes, and states all "have a" geographical location. The normal way to implement that is a one to many relation. Create a geolocation table, and add a geolocation_id column to the cities, zip code, and state tables.

EDIT: Per your comment, to get from a geolocation to its cities:

select    *
from      geolocation g
left join cities c
on        g.id = c.geolocation_id
left join zipcodes z
on        g.id = z.geolocation_id
....

The database will resolve the joins using the foreign key index, which is very fast.

Andomar
Thanks! But now, given a geolocation_id, how do I trace it back to the correct table? One way will be to add a foreign_id/foreign_table entries to the geolocation table, but then we're back to square one. Or are you suggesting a query with multiple joins? Is that efficient? Thanks again!
bosh
@bosh: Right, you'd use a query with multiple joins. As long as you define foreign keys, that's a very efficient operation!
Andomar
+1  A: 
  • One Location Set can have many Geography items
  • One Geography item can belong to many Location Sets

Regarding the Geography item table, two approaches are possible. In the first case the super-type/subtype relationship is overlapping -- more than one sub-type can be linked to the super-type. For example, there can be GeographyID = 5 in Geography and Zipcodes, Cities, States, Countries tables.

alt text



In the second case, we can consider the exclusive (disjoint) relationship, in which only one subtype can be connected to the super-type. The parent-child relationship is used to create paths, like ZIP/City/State/Country -- that is if actual administrative areas allow for this type of relationship.

In this example, there can be GeographyID = 5 in the Geography and only one more sub-type table.

alt text

Damir Sudarevic