views:

172

answers:

11

Here's a tricky normalization/SQL/Database Design question that has been puzzling us. I hope I can state it correctly.

You have a set of activities. They are things that need to be done -- a glorified TODO list. Any given activity can be assigned to an employee.

Every activity also has an enitity for whom the activity is to be performed. Those activities are either a Contact (person) or a Customer (business). Each activity will then have either a Contact or a Customer for whom the activity will be done. For instance, the activity might be "Send a thank you card to Spacely Sprockets (a customer)" or "Send marketing literature to Tony Almeida (a Contact)".

From that structure, we then need to be able to query to find all the activities a given employee has to do, listing them in a single relation that would be something like this in it simplest form:

-----------------------------------------------------
| Activity | Description    | Recipient of Activity |
-----------------------------------------------------

The idea here is to avoid having two columns for Contact and Customer with one of them null.

I hope I've described this correctly, as this isn't as obvious as it might seem at first glance.

So the question is: What is the "right" design for the database and how would you query it to get the information asked for?

+1  A: 

You would have something like follows:

Activity | Description | Recipient Type

Where Recipient Type is one of Contact or Customer

You would then execute a SQL select statement as follows:
Select * from table where Recipient_Type = 'Contact';

I realize there needs to be more information.

We will need an additional table that is representative of Recipients(Contacts and Customers):

This table should look as follows:

ID | Name| Recipient Type

Recipient Type will be a key reference to the table initially mentioned earlier in this post. Of course there will need to be work done to handle cascades across these tables, mostly on updates and deletes. So to quickly recap:

Recipients.Recipient_Type is a FK to Table.Recipient_Type

Woot4Moo
Where does it say who the recipient is in this model? How will you manage Foreign Keys to `Contacts`/`Customers`? Will you make a distinction between these?
Martin Smith
Not quite sure I follow your question Martin, I see now
Woot4Moo
This should be more flowing and answer the question. Let me know if it needs more explanation
Woot4Moo
+4  A: 

The "right" design for this database is to have one column for each, which you say you are trying to avoid. This allows for a proper foreign key relationship to be defined between those two columns and their respective tables. Using the same column for a key that refers to two different tables will make queries ugly and you can't enforce referential integrity.

Activities table should have foreign keys ContactID, CustomerID

To show activities for employee:

SELECT ActivityName, ActivityDescription, CASE WHEN a.ContactID IS NOT NULL THEN cn.ContactName ELSE cu.CustomerName END AS Recipient
FROM activity a
LEFT JOIN contacts cn ON a.ContactID=cn.ContactID
LEFT JOIN customers cu ON a.CustomerID=cu.CustomerID
jamietre
I think there is a fundamental misunderstanding in what a foreign key is in regards to a relational database on your side.
Woot4Moo
Not sure what you mean. I may have made an assumption about what OP was proposing.
jamietre
The problem I make note of is the same as in the post with the nice diagram. How does this not become a maintenance issue in the future as more types of Recipients are added. The most proper way, see maintainable, is to have a Recipients table that allows for the additiion/modification of the Recipient_Type column that does not drastically damage the psyche of anyone maintaining the database.
Woot4Moo
@Woot4Moo it could be argued that 1) YAGNI 2) your apps need to support it so there's maintenance with your solution
Conrad Frix
If it is expected that there could be more, vastly different kinds of recipients that all need to be linked this way, then the question becomes entirely different.
jamietre
@jami Since I am assuming the OP is still in the design phase he needs to consider this. @Conrad I don't think this is a matter of yagni, I think this is more a matter of an architecture question that could cause very large problems down the road. Especially if they start sub-typing customer and contact types.
Woot4Moo
@Woot4Moo we are now getting to a more basic question of good entity design. That's fine, I was just trying to address the rather pure question of "what's the best way to refer to a child in one of two tables" which comes up sometimes. To generalize his problem, we'd need to ask, what's the relationship between "contact" and "customer?" Is a "contact" always a child of "customer?" Why not just design Activity to always refer to a "contact" (which is subordinate to "customer" already) and add a flag to indicate it's actually directed at the entire organization that "contact" belongs to?
jamietre
@jami I think in the OP's question a Contact is a person and a Customer is a business. Additionally I don't think these two are necessarily in a master/slave relationship. However, I will wait for the OP to reply with more information
Woot4Moo
+8  A: 

It sounds like a basic many-to-many relationship and I'd model it as such.

alt text

Joe Stefanelli
How does this scale as more Types of Recipients are added, in terms of maintenance?
Woot4Moo
I like this in many ways, but it adds two extra tables... and extra functionality. Unless one needs the ability to have multiple "recipients" for a single activity I think having two foreign keys on the Activity table would be simpler.
jamietre
@Woot4Moo: Obviously it scales by requiring additional tables, but it avoids the cascading issues mentioned in your solution.
Joe Stefanelli
@jamietre: The OP's requirements specifically stated: "...avoid having two columns for Contact and Customer with one of them null."
Joe Stefanelli
Fair enough... I would fail the mensa question, but I think OP may be trying too hard to avoid that. I can't think of any advantage to this solution versus having a column that's null sometimes unless you need the extra functionality. Getting the output he wants is logically the same either way and there's certainly no advantage from the p.o.v. of space use.
jamietre
@Joe I asked scaling in terms of maintenance, not scaling in general.
Woot4Moo
If you have the requirement that an activity can only have one recipient, then you need to have additional code to check that the activityID doesn't exist in any of the other mapping tables when you insert it into one of them. The safest way to do that is probably a trigger on each mapping table right? It's just a little extra work when you add a new recipient type, but not a huge deal.
Mike Forman
A: 
    [ActivityRecipientRecipientType]
    ActivityId
    RecipientId
    RecipientTypeCode
        |||   |||  |||_____________________________    
         |     |                                  |
         |     --------------------               |
         |                        |               |
    [Activity]                [Recipient]      [RecipientType]
    ActivityId                RecipientId      RecipientTypeCode
    ActivityDescription       RecipientName    RecipeintTypeName


    select 
      [Activity].ActivityDescription  
    , [Recipient].RecipientName
    from
      [Activity] 
  join [ActivityRecipientRecipientType] on [Activity].ActivityId = [ActivityRecipientRecipientType].ActivityId
  join [Recipient] on [ActivityRecipientRecipientType].RecipientId = [Recipient].RecipientId
  join [RecipientType] on [ActivityRecipientRecipientType].RecipientTypeCode = [RecipientType].RecipientTypeCode
  where [RecipientType].RecipientTypeName = 'Contact'
amelvin
Care to explain the need for multiple joins and how this doesn't become a DBA's nightmare?
Woot4Moo
+2  A: 

Here is my stab at it:

Basically you need activities to be associated to 1 (contact or Customer) and 1 employee that is to be a responsible person for the activity. Note you can handle referential constraint in a model like this.

Also note I added a businessEntity table that connects all People and places. (sometimes useful but not necessary). The reason for putting the businessEntity table is you could simple reference the ResponsiblePerson and the Recipient on the activity to the businessEntity and now you can have activities preformed and received by any and all people or places.

alt text

John Hartsock
+4  A: 

It's not clear to me why you are defining Customers and Contacts as separate entities, when they seem to be versions of the same entity. It seems to me that Customers are Contacts with additional information. If at all possible, I'd create one table of Contacts and then mark the ones that are Customers either with a field in that table, or by adding their ids to a table Customers that has the extended singleton customer information in it.

If you can't do that (because this is being built on top of an existing system the design of which is fixed) then you have several choices. None of the choices are good because they can't really work around the original flaw, which is storing Customers and Contacts separately.

  1. Use two columns, one NULL, to allow referential integrity to work.

  2. Build an intermediate table ActivityContacts with its own PK and two columns, one NULL, to point to the Customer or Contact. This allows you to build a "clean" Activity system, but pushes the ugliness into that intermediate table. (It does provide a possible benefit, which is that it allows you to limit the target of activities to people added to the intermediate table, if that's an advantage to you).

  3. Carry the original design flaw into the Activities system and (I'm biting my tongue here) have parallel ContactActivity and CustomerActivity tables. To find all of an employee's assigned tasks, UNION those two tables together into one in a VIEW. This allows you to maintain referential integrity, does not require NULL columns, and provides you with a source from which to get your reports.

Larry Lustig
A: 
Actions
Activity_ID | Description | Recipient ID
-------------------------------------
11    | Don't ask questions | 0
12    | Be cool    | 1

Activities
ID | Description
----------------
11  | Shoot
12  | Ask out

People
ID | Type | email | phone | GPS |....
-------------------------------------
0  | Troll | [email protected] | 232323 | null | ...
1  | hottie | [email protected] | 2341241 | null | ...


select at.description,a.description, p.* from Activities at, Actions a, People p
where a."Recipient ID" = p.ID 
  and at.ID=a.activity_id

result:

Shoot | Don't ask questions | 0 | Troll | [email protected] | 232323 | null | ...  
Ask out | Be cool | 1 | hottie | [email protected] | 2341241 |null | ...
Rondo
+1  A: 

alt text

Damir Sudarevic
+2  A: 

If I've read the case right, Recipients is a generalization of Customers and Contacts.
The gen-spec design pattern is well understood.

http://stackoverflow.com/questions/3879806/data-modeling-question/3880673#3880673

Walter Mitty
A: 

Model another Entity: ActivityRecipient, which will be inherited by ActivityRecipientContact and ActivityRecipientCustomer, which will hold the proper Customer/Contact ID.

The corresponding tables will be:

Table: Activities(...., RecipientID)

Table: ActivityRecipients(RecipientID, RecipientType)

Table: ActivityRecipientContacts(RecipientID, ContactId, ...,ExtraContactInfo...)

Table: ActivityRecipientCustomers(RecipentID, CustomerId, ...,ExtraCustomerInfo...)

This way you can also have different other columns for each recipient type

Andy
A: 

I would revise that definition of Customer and Contact. A customer can be either an person or a business, right? In Brazil, there's the terms 'pessoa jurídica' and 'pessoa física' - which in a direct (and mindless) translation become 'legal person' (business) and 'physical person' (individual). A better translation was suggested by Google: 'legal entity' and 'individual'.

So, we get an person table and have an 'LegalEntity' and 'Individual' tables (if there's enough attributes to justify it - here there's plenty). And the receiver become an FK to Person table.

And where has gone the contacts? They become an table that links to person. Since a contact is a person that is contact of another person (example: my wife is my registered contact to some companies I'm customer). People can have contacts.

Note: I used the word 'Person' but you can call it 'Customer' to name that base table.

Fabricio Araujo