views:

50

answers:

3

Hi guys,

Let's say if I have a table "PersonData" in which 4 of its columns are

FromPersonId, FromEmployeeId, ToPersonId, ToEmployeeId

Any of the records only ever contains Only one From** and Only one To** and the other two columns are null. Also FromPersonId and ToPersonId belong to "Person" table and FromEmployeeId and ToEmployeeId belong to "Employee" table.

My question is: How do I PROPERLY join PersonData with Person and Employee table?

Please note I have tried bunch of different approaches but when I do a count on the PersonData I am getting different results when doing a join (i.e. it is showing more than what is in PersonData)...

How should I be doing a join? should I do one for each FromPersonId and ToPersonId to Person and similar for the Employee? (doubt this to be the case)...

Thanks,

Voodoo

+2  A: 

It looks like there are 4 possibilities:
FromPerson -> ToPerson
FromPerson -> ToEmployee
FromEmployee -> ToPerson
FromEmployee -> ToEmployee

So, I'd suggest joining in 4 queries and unioning the results. Something like (assuming you are after the from and to names for the purposes of the example):

SELECT from.name from_name, to.name to_name
FROM Person from, PersonData pd, Person to
WHERE from.Id = pd.FromPersonId
AND to.Id = pd.ToPersonId
UNION
SELECT from.name from_name, to.name to_name
FROM Person from, PersonData pd, Employee to
WHERE from.Id = pd.FromPersonId
AND to.Id = pd.ToEmployeeId
UNION
... (you get the picture)

John Pickup
My tables are greater than 25 columns and I need pretty much all of data in them, is there another approach other than union? Is there a way I can Distinctly join the Person and Employee to PersonData.
VoodooChild
+1  A: 

Ok, so it sounds like the problem is that if you join to Person on FromPersonID=PersonID and ToPersonId=PersonID, you get 2 records for each record in PersonData.

The way around this would be to join to 2 aliased copies of PersonData so you can attach the results into the same row. Something like:

Select * from PersonData
LEFT JOIN Person p1 on p1.PersonID = FromPersonID
LEFT JOIN Person p2 on p2.PersonID = ToPersonID

and basically do the same for the to and from employee fields

Rob Cooney
F$#@# Left join was the answer....You are awesome :)
VoodooChild
Also Rob, any good guides on the difference between different types of joins which will help me, something not hard to follow :) thanks again.
VoodooChild
np :) I think the main thing to know about joins is just that an inner join will only return results in both tables, where an outer join will return all rows from the dominant table and null values for fields that don't match a record in the other table. Inner joins are done by default if you don't specify. A left join is an outer join that just says the dominant table is the first one specified. Imagine you have the field values from PersonData sitting on the left, and then we just "bolt on" the p1 values, if they exist. Now that combination is the new left, and we "bolt on" any p2 values.
Rob Cooney
+1  A: 

Because you haven't posted you schema as SQL DDL, I'm having problems seeing how these tables can work in practise. Here's my attempt:

It would seem a fair assumption that an employee must be a person, so that's easy enough (guessing data types and domain rules):

CREATE TABLE NaturalPersons
(
 PersonId INTEGER NOT NULL UNIQUE
);

CREATE TABLE Employees
(
 PersonId INTEGER NOT NULL UNIQUE
    REFERENCES NaturalPersons (PersonId), 
 EmployeeID CHAR(3) NOT NULL UNIQUE
    CHECK (EmployeeID LIKE '[A-Z][0-9][0-9]')
);

The table name 'PersonData' doesn't reveal much but from the data element names it seems that something is being transferred from one person/employee to another:

CREATE TABLE Transfers
(
 FromPersonId INTEGER 
    REFERENCES NaturalPersons (PersonId), 
 FromEmployeeID CHAR(3)
    REFERENCES Employees (EmployeeID),
 ToPersonId INTEGER 
    REFERENCES NaturalPersons (PersonId), 
 ToEmployeeID CHAR(3)
    REFERENCES Employees (EmployeeID)
);

Hmm, all NULLable columns means we can't have a PRIMARY KEY but I wonder if there's a key at all...?

We only want one type of ID for 'from' and 'to' respectively:

ALTER TABLE Transfers ADD
 CONSTRAINT only_one_from_ID
    CHECK (
           (FromPersonId IS NULL AND FromEmployeeID IS NOT NULL)
           OR
           (FromPersonId IS NOT NULL AND FromEmployeeID IS NULL)
          );

ALTER TABLE Transfers ADD
 CONSTRAINT only_one_to_ID
    CHECK (
           (ToPersonId IS NULL AND ToEmployeeID IS NOT NULL)
           OR
           (ToPersonId IS NOT NULL AND ToEmployeeID IS NULL)
          );

We will also want a 'common sense' business rules to prevent transfers from and to the same person/employee:

ALTER TABLE Transfers ADD
 CONSTRAINT FromPersonId_cannot_be_ToPersonId
    CHECK (FromPersonId <> ToPersonId);

ALTER TABLE Transfers ADD
 CONSTRAINT FromEmployeeId_cannot_be_ToEmployeeId
    CHECK (FromEmployeeId <> ToEmployeeId);

That's about the best we can do, but we have a couple of problems:

INSERT INTO NaturalPersons (PersonId) VALUES (1), (2), (3), (4);
INSERT INTO Employees (PersonId, EmployeeID) VALUES (1, 'A11'), (2, 'B22');

-- transfer to same entity - oops!:
INSERT INTO Transfers (FromPersonId, ToEmployeeID) VALUES (1, 'A11'); 

-- Duplicate transfer - oops!:
INSERT INTO Transfers (FromEmployeeId, ToPersonID) VALUES (1, 'B1'); -- duplicate
INSERT INTO Transfers (FromPersonId, ToEmployeeID) VALUES ('A1', 2); -- duplicate

In other words, mixing PersonId and EmployeeID in the same table makes it hard to write basic data rules.

If I am correct in assuming an employee is a person, why not just use PersonID only?

If an employee is not a person, can you post your schema (data type, constraints, etc) please?

onedaywhen
Thanks and +1 for the great explanation which I hope to read up thoroughly later on. But my question was just as an example and not really involving Person or Employee - I needed to do something similar with some other old app data but was drawing a complete blank...
VoodooChild