views:

408

answers:

4

I looking for the best way to check for inter-table constraints an step forward of foreing keys. For instance, to check if a date child record value is between a range date on two parent rows columns. For instance:

Parent table  
ID    DATE_MIN   DATE_MAX
----- ---------- ----------
1     01/01/2009 01/03/2009
...

Child table
PARENT_ID  DATE
---------- ----------
1          01/02/2009
1          01/12/2009   <--- HAVE TO FAIL!
...

I see two approaches:

  • Create materialized views on-commit as shown in this article (or other equivalent on other RDBMS).
  • Use stored-procedures and triggers.

Any other approach? Which is the best option?

UPDATE: The motivation of this question is not about "putting the constraints on database or on application". I think this is a tired question and anyone does the way she loves. And, I'm sorry for detractors, I'm developing with constraints on database. From here, the question is "which is the best option to manage inter-table constraints on database?". I'm added "inside database" on the question title.

UPDATE 2: Some one added the "oracle" tag. Of course materialized views are oracle-tools but I'm interested on any option regardless it's on oracle or others RDBMSs.

+4  A: 

EDIT: Dana the Sane deleted his post which was to put it in the data layer regardless of DBA objections.


The reasons DBAs scream at Developers like Dana is they assume that there's a 1:1 ratio between applications and databases. They see this because they see the data as there to support the app and their app only needs to store data in one place.

DBA's see the data as the most important thing, and don't care if the app comes or goes.

If you lost the use of MS Word, would you care if you could still get to your documents? No, the data is important to you, the app isn't.

If you ever let anything bypass your app, to get to your data, you've lost your constraints in your data layer. If your constraints are in your database layer, a dozens apps would all be able you use it.

Ideally, you'd never grant INSERT, UPDATE or DELETE to anyone. Instead you'd grant EXECUTE On packages that will do the CRUD for you. If you do this from the beginning, the ability to add rules to the INSERT of a CHILD (like checking if birth is between parent dates) is virtually infinite.

I'm updated the question to clarify the target.
FerranB
Yes and I said, within the DB use CRUD packages, not triggers or mviews or something else. Crud solves most of it.
CRUD works well. For single tables. But when you're dealing with rules that apply across multiple tables (as in this question), you end up repeating a lot of logic if you don't enforce at lower levels.
Damien_The_Unbeliever
No, CRUD works well for anything. You can have INS/UPD/DEL for some complex object that's stored in 5 tables. There's no requirement that it be 1pkg:1tab. In this case, the add_child CRUD would accept the parent ID's so the dates could be checked against the parents. What's lower than the database?
I think you're limiting CRUD possibilities based on some prejudice. (the actual meaning of that word, not the pejorative)
upd_parent needs to ensure it doesn't invalidate children. upd_child needs to ensure it doesn't move the child outside of valid ranges. That's three places to enfore a rule, which could have been enforced with actual DRI, in one place.
Damien_The_Unbeliever
As in, when I was referring to lower levels - it's better to enforce with CHECK/FK/UNIQUE constraints. If that aint possible, constrain at the Trigger level. Only after that, conside Stored Procs for enforcement.
Damien_The_Unbeliever
Ugh... triggers. Please no. Debugging through triggers is such a nightmare. Hey proc_a runs slow... hmm, all it does is insert one row, why would that be slow... ah there's 100 cascading triggers. ugh. Whereas if all the code in those 100 triggers were in procs, the debugging would be piece of pie.
+1  A: 

I'd go the stored proc and trigger route; one of their major purposes is to ensure data integrity at the DB level.

Most databases also have some form of check constraints, in which pretty much anything you can put in a WHERE clause can be used as a check against the data:

CREATE FUNCTION CheckFnctn()
RETURNS int
AS 
BEGIN
   DECLARE @retval int
   SELECT @retval = COUNT(*) 
   FROM PARENT
   INNER JOIN CHILD ON PARENT.ID = CHILD.PARENT_ID
   WHERE CHILD.DATE < PARENT.DATE_MIN OR CHILD.DATE > PARENT.DATE_MAX
   RETURN @retval
END;
GO
ALTER TABLE CHILD
ADD CONSTRAINT chkDates CHECK (dbo.CheckFnctn() = 0 );
GO
Lurker Indeed
My two main concerns with that are 1) You're possibly incurring a full table scan for every insert/update, and 2) If the Parent table can be updated, you need to have the same check constraint run on there too.
Damien_The_Unbeliever
A: 

Okay, in the specific example, I'd go for redundantly storing redundant data. Through a combination of CHECKs and FKs (and super keys), we ensure that the data is always correct, then we wrap a view and triggers around this to hide the implementation details:

create table dbo.Parents (
    ParentID int IDENTITY(1,1) not null,
    ValidFrom datetime not null,
    ValidTo datetime not null,
    /* Natural Key column(s) */
    CONSTRAINT PK_dbo_Parents PRIMARY KEY (ParentID),
    CONSTRAINT UQ_dbo_Parents_DRI UNIQUE (ParentID, ValidFrom, ValidTo),
    /* Unique constraint on Natural Key */
    CONSTRAINT CK_dbo_Parents_ValidDates CHECK (ValidFrom <= ValidTo) /* Semi-open interval */
)
go
alter table dbo.Parents add constraint DF_dbo_Parents_ValidFrom DEFAULT (CURRENT_TIMESTAMP) for ValidFrom
go
alter table dbo.Parents add constraint DF_dbo_Parents_ValidTo DEFAULT (CONVERT(datetime,'99991231')) for ValidTo
go
create table dbo._Children (
    ChildID int IDENTITY(1,1) not null, /* We'll need this in the update trigger */
    ParentID int not null,
    ChildDate datetime not null,
    _ParentValidFrom datetime not null,
    _ParentValidTo datetime not null,
    CONSTRAINT PK_dbo__Children PRIMARY KEY (ChildID),
    CONSTRAINT FK_dbo__Children_Parents FOREIGN KEY (ParentID,_ParentValidFrom,_ParentValidTo) REFERENCES dbo.Parents (ParentID,ValidFrom,ValidTo) ON UPDATE CASCADE,
    CONSTRAINT CK_dbo__Children_ValidDate CHECK (_ParentValidFrom <= ChildDate and ChildDate < _ParentValidTo) /* See, semi-open */
)
go
alter table dbo._Children add constraint DF_dbo__Children_ChildDate DEFAULT (CURRENT_TIMESTAMP) for ChildDate
go
create view dbo.Children (ChildID,ParentID,ChildDate)
with schemabinding
as
select ChildID,ParentID,ChildDate from dbo._Children
go
create trigger dbo.T_Children_I on dbo.Children instead of insert
as
begin
    set nocount on

    insert into dbo._Children (ParentID,ChildDate,_ParentValidFrom,_ParentValidTo)
    select i.ParentID,i.ChildDate,p.ValidFrom,p.ValidTo
    from
     inserted i
      inner join
     dbo.Parents p
      on
       i.ParentID = p.ParentID
end
go
create trigger dbo.T_Children_U on dbo.Children instead of update
as
begin
    set nocount on
    if UPDATE(ChildID)
    begin
     RAISERROR('Updates to ChildID are not allowed',16,1)
     return
    end

    update c
    set
     ParentID = i.ParentID,
     ChildDate = i.ChildDate,
     _ParentValidFrom = p.ValidFrom,
     _ParentValidTo = p.ValidTo
    from
     inserted i
      inner join
     dbo._Children c
      on
       i.ChildID = c.ChildID
      inner join
     dbo.Parents p
      on
       i.ParentID = p.ParentID
end
go
insert into dbo.Parents(ValidFrom,ValidTo)
select '20081201','20090101' union all
select '20090201','20090301'
/* (2 row(s) affected) */
go
insert into dbo.Children (ParentID,ChildDate)
select 1,'20081215'
/* (1 row(s) affected) */
go
insert into dbo.Children (ParentID,ChildDate)
select 1,'20090115'
/*
Msg 547, Level 16, State 0, Procedure T_Children_I, Line 6
The INSERT statement conflicted with the CHECK constraint "CK_dbo__Children_ValidDate". The conflict occurred in database "Play", table "dbo._Children".
The statement has been terminated.
*/
go
update dbo.Parents set ValidTo = '20090201' where ParentID = 1
/* (1 row(s) affected) */
go
insert into dbo.Children (ParentID,ChildDate)
select 1,'20090115'
/* (1 row(s) affected) */
go
update dbo.Parents set ValidTo = '20090101' where ParentID = 1
/*
Msg 547, Level 16, State 0, Line 1
The UPDATE statement conflicted with the CHECK constraint "CK_dbo__Children_ValidDate". The conflict occurred in database "Play", table "dbo._Children".
The statement has been terminated.
*/
go
insert into dbo.Children (ParentID,ChildDate)
select 2,'20090215'
/* (1 row(s) affected) */
go
update dbo.Children set ChildDate = '20090115' where ParentID=2 and ChildDate = '20090215'
/*
Msg 547, Level 16, State 0, Procedure T_Children_U, Line 11
The UPDATE statement conflicted with the CHECK constraint "CK_dbo__Children_ValidDate". The conflict occurred in database "Play", table "dbo._Children".
The statement has been terminated.
*/
go
delete from dbo.Children
/* (3 row(s) affected) */
go
/* Clean up after testing */
drop view dbo.Children
drop table dbo._Children
drop table dbo.Parents
go

This is for SQL Server. Tested on 2005, but should work on at least 2000 and 2008 too. A bonus here is that even if the trigger is disabled (e.g. nested triggers are turned off), You cannot end up with wrong data in the base tables

Damien_The_Unbeliever
+2  A: 

Database constraints

The best way to enforce a database constraint (a constraint which span two or more relations - of which a referential integrity constraint is a particular case with a syntactical shorthand, foreign key/references statements) would be declaratively, by means of the standard SQL statement:

create assertion <name> check (<condition>)

In your case, something like:

create assertion Child_DATE_between_MIN_MAX check (
  not exists (
    select DATE_MIN, DATE, DATE_MAX
      from Parent, Child
     where ID = PARENT_ID
       and DATE < DATE_MIN
       and DATE > DATE_MAX
  )
)

UPDATE: I forgot that <condition> is strictly a boolean, therefore the old code was not correct.

Unfortunately (moderate sarcasm here) most SQL-DBMSes do not implement ASSERTIONs.

So one is left to implement this check procedurally, with stored procedures and triggers or check constraints, if available. In this case one needs to call the same stored procedures for updates to both Parent and Child relations; so one procedure and two triggers or check constraints.

Lurker Indeed's answer shows such solution, but it needs a similar check on Child relation.

Concerns about performances

Damien_The_Unbeliever, in a comment to the same answer, argues:

1) You're possibly incurring a full table scan for every insert/update

Here I will address this objection, because it is very common and may seem valid even for ASSERTIONs (it is likely that this is a popular misconception that persuades users to not ask SQL-DBMS implementors about them, even when they know that it is in the standard).

Well, yes, he is right.. if one uses a DBMS that sucks!

There is an interesting piece of theory that it is possible to apply to integrity maintenance: Differential Relational Calculus (available as .pdf here; you also find adequate treatment of the subject in every decent book about DB theory).

The core idea is that it is possible to enforce integrity constraints often checking only subsets of relations involved by an update. More rigorously, quoting abstract of the linked paper:

... Formal differentiation of first-order sentences is useful in maintaining database integrity, since once a database constraint is expressed as a first-order sentence, its derivative with respect to a transaction provides the necessary and sufficient condition for maintaining integrity. The derivative is often much simpler to test than the original constraint since it maintains integrity differentially by assuming integrity before the transaction, and testing only for new violations. ...

There are other techniques to deal with incremental integrity constraints maintenance. There are no good reasons for DBMS developers to ignore such theory. In fact, the authors of An Amateur's Introduction to Integrity Constraints and Integrity Checking in SQL (.pdf) wrote in the introduction:

1 Introduction

... Commercial relational DBMS products supporting SQL, however (such as, e.g., Oracle [Ora99] or DB2 [IBM99]) do not support the more advanced forms of constraints. Even today, more than 8 years after the SQL'92 standard has been issued, none of these commercial systems supports assertions, the most general form of constraint in SQL! Scientific literature, i.e. research papers, devoted to integrity on the other hand provide a wealth of promising results applicable to very general and powerful forms of integrity constraints. ...

So, please: ask your SQL-DBMS supplier (commercial or free/open source) to implement ASSERTIONs now and with, at least, reasonable performance.

MaD70
Very interesting, I never heard about "Assertions". Anyway, on Oracle the assertions can be achieved through Materialized Views, I think that's the reason why Oracle does not implement it.
FerranB
With a materialized view **only**? How? Anyway resorting to expedients is wrong: ASSERTIONs are in the standard and are the right thing (there are other parts really horrible in the standard that do not merit to be implemented). So they don't have excuses.
MaD70

related questions