views:

216

answers:

2

I'm trying to understand rules of relational model as originally defined by Edgar Codd in 1970.

Specifically I'm interested whether referential integrity is part of his relational model or not. I'll try to demonstrate on following example (just to make this question pretty):

Customers

+------+------------
| Name | Address
|------+------------
| John | ....
| Mike | ....
| Kate | ....
+------+------------

Invoices

+------+------------
|  ID  | Customer
|------+------------
|   1  | John
|   2  | John
|   3  | Mary
+------+------------

Now, obviously as you can see, we have one invoice where customer (foreign key) is Mary. Would this violate his relational model? Would Edgar Codd look at this and say, gee, what the heck? Or would he say, it's perfectly fine...

This is theoretical question.

+2  A: 

If there is no customer named Mary in the Customers table, then there is no referential integrity between the tables. Specifically, a foreign key refers to a non-existent primary key.

Does this break the relational model? No. It's defined in the relational model (i.e. lack of referential integrity) and is an indication that there is a problem with the underlying data.

Michael Todd
I'm trying to find on google "relational model" "lack of referential integrity" but it doesn't return anything. I would really like to read about this. How do you know it's defined in the relational model? Have you got any reference?
lubos hasko
I'm currently taking a class at UNLV specifically for database development (from the MIS-side, not the CS-side), so I just rattled the answer off. I don't have my books with me right now but I'll try to find something for you.
Michael Todd
Hmmm...in my quick search I found nothing "high-level" enough to explain my position (just specifics about DB design or one-liners). You may want to try accessing Codd's documents on www.acm.org (requires a membership, though).
Michael Todd
Gotta love the internet. http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf is the text of Codd's "A Relational Model of Data for Large Shared Data Banks." On the last page it talks about inconsistencies, how they occur, and ways to resolve them.
Michael Todd
+1  A: 

I read the following as clearly stating that referential integrity is included in the relational model:

Two integrity rules apply to every relational database:

1 Entity integrity:
No mark of either type is permitted in any attribute which is a component of the primary key of a base relation

2 Referential integrity:
Let D be a domain from which one or more single-attribute primary keys draw their values. Let K be a foreign key which draws its values from domain D. Every unmarked value which occurs in K must also exist in the database as a value in the primary key of some base relation.

"Missing information (applicable and inapplicable) in relational databases," E. F. Codd, ACM SIGMOD Record, vol. 15, no. 4, pp. 53-78, 1986.

By "mark of either type" he is referring to an unknown value, for which we use NULL today. This paper suggested two different types of unknown values, one for "applicable but missing," and one for "inapplicable."

By "unmarked" he means not NULL.

Bill Karwin

related questions