tags:

views:

99

answers:

2

Setup: Contact database using 4 tables

  • Contacts
  • Cities
  • States
  • Zips

Structure:

CREATE TABLE `contacts` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `last` varchar(100) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `first` varchar(100) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `prefix` varchar(50) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `suffix` varchar(50) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `address` varchar(100) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `address_1` varchar(100) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `city_id` int(100) DEFAULT NULL,
  `state_id` int(20) DEFAULT NULL,
  `alt_address_1` varchar(255) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `alt_address_2` varchar(255) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `alt_city` varchar(100) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `alt_state` varchar(20) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `alt_zip` varchar(15) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `publish_name` varchar(255) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `salutation` varchar(255) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `mail_label` varchar(255) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `solicitor` varchar(100) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `is_volunteer` tinyint(1) DEFAULT NULL,
  `is_sponsor` tinyint(1) DEFAULT '0',
  `is_company` tinyint(1) DEFAULT '0',
  `is_foundation` tinyint(1) DEFAULT '0',
  `status` varchar(15) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `created_on` datetime NOT NULL,
  `created_by` varchar(30) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `modified_on` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `modified_by` varchar(100) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `agency_id` int(25) DEFAULT NULL,
  `primary_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `primary_id` (`primary_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3008 DEFAULT CHARSET=utf8

CREATE TABLE `cities` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `city` varchar(50) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `stateid` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `city` (`city`)
) ENGINE=InnoDB AUTO_INCREMENT=128 DEFAULT CHARSET=utf8

CREATE TABLE `states` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `abbreviation` varchar(2) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `state` varchar(20) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `state` (`state`),
  UNIQUE KEY `abbreviation` (`abbreviation`),
  KEY `id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=52 DEFAULT CHARSET=utf8

CREATE TABLE `zips` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `zip` varchar(10) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  `cityid` int(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `zip` (`zip`)
) ENGINE=InnoDB AUTO_INCREMENT=128 DEFAULT CHARSET=utf8

I have filled the contacts with 111 contacts, the states are simply all of the states, and cities have corresponding id keys that relate to the state id, zip codes have a key to match to a city.

The query is to generate a list of people to match up to the proper fields. Here is the query.

SELECT concat(contacts.last,' ', contacts.first) as name
     , cities.city
     , zips.zip
  FROM contacts
  JOIN cities
    ON cities.id = contacts.city_id
  JOIN states ON states.id = contacts.state_id
  JOIN zips ON zips.cityid = cities.id

This query, return 338 rows, of a possible 11 contacts. There obvious duplicates. This happens when I join zip codes, which because they belong to more than 1 city, it gets matched for each city (I think thats what is happening). Anyone have an answer on how to properly join these tables?

Thank you. Rich

A: 

your tables a properly joined. i think the problem you have here is that you have normalized your data improperly and gone too far. just store the address as provided. don't try and split it in to table with numeric ID's.

for example, what possible benefit do you get by storing the state as a number in the contact table instead of just the state code? and the same question applies to city.

longneck
State, city, and zip code data can be used in a large variety of places. Table nomalization will not help here. Rethinking the surrogate keys would however.
Kevin Peno
+1  A: 

I believe that you should re-think the surrogate key usage on a lot of these tables and use natural keys where possible. Taking the state table for an example, in most cases it will be acceptable to simply use the state short (ie. TX vs. Texas) for data and display purposes. This means that if you removed the incrementing ID on the states table and used a natural key for each state, you would reduce the necessecity of a join in 90% of cases.

Then use state.abbriviation as the FK in tables that need to store state values. Extending this to zipcodes and cities, you can FK the state abbr to the city table, and make a compound FK form the cities table to the contacts table giving you a key for both the city and the state at the same time.

Example schema (excluded zipcodes table and shortened contacts table):

CREATE  TABLE IF NOT EXISTS `states` (
  `state_id` CHAR(2) NOT NULL ,
  `name` VARCHAR(45) NULL ,
  PRIMARY KEY (`state_id`) ,
  UNIQUE INDEX `state_name` (`name` ASC)
)
ENGINE = InnoDB;

CREATE  TABLE IF NOT EXISTS `cities` (
  `state_id` CHAR(2) NOT NULL ,
  `city_name` VARCHAR(255) NOT NULL ,
  PRIMARY KEY (`state_id`, `city_name`) ,
  INDEX `fk_city_state_id` (`state_id` ASC) ,
  CONSTRAINT `fk_city_state_id`
    FOREIGN KEY (`state_id` )
    REFERENCES `states` (`state_id` )
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
)
ENGINE = InnoDB;

CREATE  TABLE IF NOT EXISTS `contacts` (
  `contacts_id` INT NOT NULL AUTO_INCREMENT ,
  `state` CHAR(2) NULL ,
  `city` VARCHAR(255) NULL ,
  PRIMARY KEY (`contacts_id`) ,
  INDEX `fk_contact_city` (`state` ASC, `city` ASC) ,
  INDEX `fk_contact_state` (`state` ASC) ,
  CONSTRAINT `fk_contact_city`
    FOREIGN KEY (`state` , `city` )
    REFERENCES `cities` (`state_id` , `city_name` )
    ON DELETE NO ACTION
    ON UPDATE NO ACTION,
  CONSTRAINT `fk_contact_state`
    FOREIGN KEY (`state` )
    REFERENCES `states` (`state_id` )
    ON DELETE NO ACTION
    ON UPDATE NO ACTION
)
ENGINE = InnoDB;

-- -----------------------------------------------------
-- Data for table `states`
-- -----------------------------------------------------
SET AUTOCOMMIT=0;
INSERT INTO `states` (`state_id`, `name`) VALUES ('TX', 'Texas');
INSERT INTO `states` (`state_id`, `name`) VALUES ('CA', 'California');
INSERT INTO `states` (`state_id`, `name`) VALUES ('OR', 'Oregon');
COMMIT;

-- -----------------------------------------------------
-- Data for table `cities`
-- -----------------------------------------------------
SET AUTOCOMMIT=0;
INSERT INTO `cities` (`state_id`, `city_name`) VALUES ('CA', 'modesto');
INSERT INTO `cities` (`state_id`, `city_name`) VALUES ('OR', 'protland');
INSERT INTO `cities` (`state_id`, `city_name`) VALUES ('TX', 'Dallas');
COMMIT;

Now your query is simplified except in the extreme case where you'd need a full state nomenclature:

SELECT
    concat(contacts.last,' ', contacts.first) as name,
    city,
    state,
    zip
FROM contacts
WHERE {INSERTWHERE}
Kevin Peno
Interesting - I had not realized foreign keys were supported in mysql - I always thought they didn't. Im going to try this solution out and see how it works.ThanksRich
Richard Testani
They are supported, provided you are using INNODB :)
Kevin Peno
Im working out this example, but with a few questions. Would you create a similar relationship with zips on city_name -> zipcode?Also, in the contacts table - to look up the full state name, I would continue to use joins. Correct?Thanks for you help on this.
Richard Testani
You would indeed need to do a join (or separate query) to get the FULL name of a state. For City -> ZipCode relationship, I'd probably put a FK in the city table for the zipcode (to associate it with a city), but not necessarily make it part of the PK and then put an FK in the contacts table pointing to the zipcodes table (this would make sure that you are entering valid zipcodes). If you need to validate City, state, and zip, perhaps just bring an FK from the cities table with all 3 fields. It really depends on your setup.
Kevin Peno