ansaurus

Question

How can I break referential integrity briefly, within a transaction, without disabling the foreign key constraint?

Answer 1

+2 A:

You need to use a deferrable constraint (see Chi's answer).
Otherwise, in order to add a value that will fail the foreign key constraint, you have to either disable or drop & re-create the foreign key constraint.

Situations like these employ a surrogate key that can be altered by users as necessary, without impacting referential integrity. To expand on this idea, currently the setup is:

ID (pk)
PARENT_ID (foreign key, references ID column -- making it self referential)

..and the business rules are that ID can change. Which is fundamentally bad from a design perspective - primary key are immutable, unique, and can't be null. So the solution to the situation when you're building your data model is to use:

ID (pk)
PARENT_ID (foreign key, references ID column -- making it self referential)
SURROGATE_KEY (unique constraint)

The SURROGATE_KEY is the column that supports change without affecting referential integrity - the parent & child relationship is intact. This means that a user can tweak the surrogate key to their hearts delight without needing deferred constraints, enable/disable or drop/recreate foreign key constraints, ON UPDATE CASCADE...

As a rule, in data modeling you NEVER display primary key values to the user because of situations like these. For example, I have a client who wants their jobs number to change on the start of the year, with the year at the start of the number (IE: 201000001 would be the first job created in 2010). What happens when the client sells the company, and the new owner needs a different scheme for their accounting? Or, what if the numbering can't be maintained while transitioning to a different database vendor?

OMG Ponies 2010-06-23 21:39:44

I considered adding a surrogate key but then the integrity of my table relies on having a series of triggers to ensure nobody broke anything, rather than a single foreign key constraint... it seems like a frustrating trade-off.

RenderIn 2010-06-23 21:42:39

@Renderln: I loath it too, but telling users "NO!" is what stops me from being a business analyst =)

OMG Ponies 2010-06-23 21:44:33

@OMG Ponies: you have the most memorable name on all of stackoverflow. Every time I see it, I giggle and feel better about my day.

RenderIn 2010-06-23 21:45:55

@OMG - actually, you can. See @Chi's answer.

DCookie 2010-06-23 21:56:14

@DCookie: Worst part is, I remember hearing about deferrable constraints before :/

OMG Ponies 2010-06-23 22:08:12

@OMG, at least you can remember ;-)

DCookie 2010-06-23 22:11:15

@OMG Ponies: Introducing a surrogate key is indeed the best option by far. Probably I'm reading it wrong, but it looks like you are suggesting in your second paragraph that the surrogate key can be altered by users? The purpose of introducing a surrogate key, is that the surrogate key is used for referential integrity and is immutable, while the real/business key can be altered.

Rob van Wijk 2010-06-24 08:15:48

@OMG Ponies: I'd love to see an example here, like @Shannon Severance did elsewhere - I'm arguing with you on the method elsewhere, maybe you could convince me to eat my words AND help educate others? ;-)

Chris Kaminski 2010-06-24 14:52:23

@Rob van Wijk: I prefer natural and artificial key for referential integrity terminology. For this situation, surrogate is accurate - the surrogate is a substitute reference for the user, in this case that they can change. If the surrogate is also immutable, then the OP is back to where they started but with an additional column ;)

OMG Ponies 2010-06-24 15:59:35

@Chris Kaminski: Your mission, should you choose to accept it, is to proofread my updated answer.

OMG Ponies 2010-06-24 16:01:09

Digesting - I appreciate you taking the time to add this. :-)

Chris Kaminski 2010-06-24 16:42:17

Answer 2

+14 A:

What you want is a 'deferrable constraint'.

You can pick between the two types of deferrable constraints, 'INITIALLY IMMEDIATE' and 'INITIALLY DEFERRED' to drive default behavior - whether the database should default to check the constraint after every statement, or if it should default to only checking constraints at the end of the transaction.

Chi 2010-06-23 21:46:40

sorry, link should be fixed now

Chi 2010-06-23 21:48:17

The link is broken.

Shannon Severance 2010-06-23 21:49:06

Links works now, but is incredibly slow to load (on my highspeed connection). Only concern is what versions of Oracle this applies to - can't tell from the link.

OMG Ponies 2010-06-23 21:54:15

The Oracle 9.2 version of that documentation describes deferrable constraints as well, so it should be available at least from 9.2 - http://download.oracle.com/docs/cd/B10501_01/server.920/a96524/c22integ.htm#4666

Chi 2010-06-23 21:58:57

+1: For answering the question, but the data model should be constructed so hacks like these aren't necessary.

OMG Ponies 2010-06-23 22:04:45

This doesn't seem like a hack to me - it seems like a reasonable expectation to support. Either you want some lee-way in your ACID, or you don't. It would really be up to the user to decide.

Chris Kaminski 2010-06-23 23:50:08

@Chris Kaminski: No DBA would ever let you get away with using deferred constraints. What I suggested in my answer about using a surrogate key that the user can alter at will, without impacting referencial integrity, would've stopped this need from occurring in the first place.

OMG Ponies 2010-06-24 00:26:03

@Chi This solves exactly the problem I was facing. I'm getting a bit scared based on what others are saying about it though, so perhaps I need to consider using a surrogate key and then just accepting that the requirement will be on my code to enforce this integrity in my insert/update/delete procedures and/or triggers. I wish there was a less messy, "correct" way to do this.

RenderIn 2010-06-24 13:41:54

One caveat about deferred constraints: if the constraint is not valid when you try to commit, the transaction will automatically rollback. This is one of the few places where the database takes transactional control out of the user's (or application's) hands.

Allan 2010-06-24 14:17:32

@OMG Ponies: then the DBAs are idiots. As long as I complete my transaction with referential integrity intact, it shouldn't matter how I got there. And your surrogate keys is a horrible hack to get around DBAs with more ego than sense, IMHO, or software that just isn't robust enough to manage it (MySql?).

Chris Kaminski 2010-06-24 14:52:45

@OMG Ponies: I'll agree that hacks like this "shouldn't" be needed, but I have encountered situations where, due to legacy design issues (i.e. "What moron came up with this POS?!?"), deferred constraints were the only reasonable way to allow the data to be properly constrained without having to write some really awful code. YMMV.

Bob Jarvis 2010-06-24 15:04:33

@Bob Jarvis: Such situations remind me of this poster: http://www.despair.com/mis24x30prin.html

OMG Ponies 2010-06-24 16:07:23

@Chris Kaminski: Good luck fighting against DBAs, it's not their fault the situation existed in the first place.

OMG Ponies 2010-06-24 16:09:45

@OMG Ponies: Concur with the fighting against DBAs bit... :)

Chris Kaminski 2010-06-24 16:39:48

@Chris Kaminski what you don't seem to understand is that the DBA's job is to protect the data not make things more convenient for the programmers. That doesn't make them idiots.

HLGEM 2010-06-24 17:05:45

Answer 3

+7 A:

Slower than Chi, but felt it would be nice to include code sample, so that the answer could be found on SO.

As Chi answered, deferrable constraints make this possible.

SQL> drop table t;

Table dropped.

SQL> create table T (ID number
  2      , parent_ID number null
  3      , name varchar2(40) not null
  4      , constraint T_PK primary key (ID)
  5      , constraint T_HIREARCHY_FK foreign key (parent_ID)
  6          references T(ID) deferrable initially immediate);

Table created.

SQL> insert into T values (1, null, 'Big Boss');

1 row created.

SQL> insert into T values (2, 1, 'Worker Bee');

1 row created.

SQL> commit;

Commit complete.

SQL> -- Since initially immediate, the following statement will fail:
SQL> update T
  2  set ID = 1000
  3  where ID = 1;
update T
*
ERROR at line 1:
ORA-02292: integrity constraint (S.T_HIREARCHY_FK) violated - child record found


SQL> set constraints all deferred;

Constraint set.

SQL> update T
  2  set ID = 1000
  3  where ID = 1;

1 row updated.

SQL> update T
  2  set parent_ID = 1000
  3  where parent_ID = 1;

1 row updated.

SQL> commit;

Commit complete.

SQL> select * from T;

        ID  PARENT_ID NAME
---------- ---------- ----------------------------------------
      1000            Big Boss
         2       1000 Worker Bee

SQL> -- set constraints all deferred during that transaction
SQL> -- and the transaction has commited, the next
SQL> -- statement will fail
SQL> update T
  2  set ID = 1
  3  where ID = 1000;
update T
*
ERROR at line 1:
ORA-02292: integrity constraint S.T_HIREARCHY_FK) violated - child record found

I believe, but could not find the reference, that deferrability is defined at constraint creation time and can not be modified later. The default is non-deferrable. To change to deferrable constraints you'll need to do a one time drop and add constraint. (Properly scheduled, controlled, etc.)

SQL> drop table t;

Table dropped.

SQL> create table T (ID number
  2      , parent_ID number null
  3      , name varchar2(40) not null
  4      , constraint T_PK primary key (ID)
  5      , constraint T_HIREARCHY_FK foreign key (parent_ID)
  6          references T(ID));

Table created.

SQL> alter table T drop constraint T_HIREARCHY_FK;

Table altered.

SQL> alter table T add constraint T_HIREARCHY_FK foreign key (parent_ID)
  2      references T(ID) deferrable initially deferred;

Table altered.

Shannon Severance 2010-06-23 22:28:03

Answer 4

+1 A:

If this were any other database besides Oracle, you could declare the foreign key with ON UPDATE CASCADE. Then if you change a parent's id, it would propagate the change atomically to the child's parent_id.

Unfortunately, Oracle implements cascading deletes but not cascading updates.

(This answer is for information purposes only, since it doesn't actually solve your problem.)

Bill Karwin 2010-06-23 23:02:47

Nice addition. Although I don't agree with your word "Unfortunately" as I think it should be discouraged. Primary keys should be immutable.

Rob van Wijk 2010-06-24 07:28:33

@Rob van Wijk: I'd agree that *pseudokeys* should be immutable. What about natural primary keys?

Bill Karwin 2010-06-24 08:10:08

The only acceptable natural primary key to me, is one that will never ever change. That's because I don't want to be bothered updating half of the entire datamodel, with the associated downtime. That's hard to explain to the business. And, immutable natural keys are rare. Even some natural keys that seem immutable at first sight, change. Just think about country codes (remember Yugoslavia,Zaire), currency codes (Euro), bank account numbers (changing from 9 digits to 10) and so on. An example of an acceptable natural key I can think of: the date column of a DW time table.

Rob van Wijk 2010-06-24 09:02:09

@RobVanWijk - " An example of an acceptable natural key I can think of: the date column of a DW time table. " Until the next Pope Gregory XIII ...

APC 2010-06-24 11:43:10

@APC: Lol. That would certainly cause protests and riots worldwide by IT companies. "You can't do that, because we have the date as a natural primary key". That's what I call evolution of the human race :-)

Rob van Wijk 2010-06-24 13:23:59

Answer 5

+3 A:

The common advice with scenarios like this is to employ deferrable constraints. However, I think these situations are almost always a failure of application logic or data model. For instance, inserting a child record and a parent record in the same transaction can be a problem if we execute it as two statements:

My test data:

SQL> select * from t23 order by id, parent_id
  2  /

        ID  PARENT_ID NAME
---------- ---------- ------------------------------
       110            parent 1
       111            parent 2
       210        110 child 0
       220        111 child 1
       221        111 child 2
       222        111 child 3

6 rows selected.

SQL>

The wrong way to do things:

SQL> insert into t23 (id, parent_id, name) values (444, 333, 'new child')
  2  /
insert into t23 (id, parent_id, name) values (444, 333, 'new child')
*
ERROR at line 1:
ORA-02291: integrity constraint (APC.T23_T23_FK) violated - parent key not
found


SQL> insert into t23 (id, parent_id, name) values (333, null, 'new parent')
  2  /

1 row created.

SQL>

However, Oracle supports a multi-table INSERT synatx which allows us to insert the parent and child records in the same statement, thus obviating the need for deferrable constraints:

SQL> rollback
  2  /

Rollback complete.

SQL> insert all
  2      into t23 (id, parent_id, name)
  3          values (child_id, parent_id, child_name)
  4      into t23 (id, name)
  5          values (parent_id, parent_name)
  6  select  333 as parent_id
  7          , 'new parent' as parent_name
  8          , 444 as child_id
  9          , 'new child' as child_name
 10  from dual
 11  /

2 rows created.

SQL>

The situation you are in is similar: you want to update the primary key of the parent record but can't because of the existence of the child records: And you can't update the child records because there is no parent key. Catch-22:

SQL> update t23
  2      set id = 555
  3  where id = 111
  4  /
update t23
*
ERROR at line 1:
ORA-02292: integrity constraint (APC.T23_T23_FK) violated - child record found


SQL> update t23
  2      set parent_id = 555
  3  where parent_id = 111
  4  /
update t23
*
ERROR at line 1:
ORA-02291: integrity constraint (APC.T23_T23_FK) violated - parent key not
found


SQL>

Once again the solution is to do it in a single statement:

SQL> update t23
  2      set id = decode(id, 111, 555, id)
  3          , parent_id = decode(parent_id, 111, 555, parent_id)
  4  where id = 111
  5     or parent_id = 111
  6  /

4 rows updated.

SQL> select * from t23 order by id, parent_id
  2  /

        ID  PARENT_ID NAME
---------- ---------- ------------------------------
       110            parent 1
       210        110 child 0
       220        555 child 1
       221        555 child 2
       222        555 child 3
       333            new parent
       444        333 new child
       555            parent 2

8 rows selected.

SQL>

The syntax in the UPDATE statement is a bit clunky but kludges usually are. The point being that we should not have to update primary key columns very often. Indeed, as immutability is one of the characteristics of "primary key-ness" we shouldn't really have to update them at all. Needing to do so is a failure of the data model. One way of avoiding such failures is to use a synthetic (surrogate) primary key, and simply enforce the uniqueness of the natural (aka business) key with a unique constraint.

So why does Oracle offer deferrable constraints? They are useful when we undertake data migrations or bulk data uploads. They permit us to cleanse data in the database without staging tables. We really shouldn't need them for regular application tasks.

APC 2010-06-24 11:39:49

Thank you for posting this. I had tested the single update to both parent and children, and it had failed. I thought it wouldn't work. Evidently there was an error in my test. Yours works fine.

Shannon Severance 2010-06-24 16:36:05

Answer 6

+4 A:

Recommendations to use a surrogate key are excellent, IMO.

More generally, the problem with this table is that it lacks a primary key. Recall that a primary key must be three things:

Unique
Non-null
Unchanging

Databases I'm familiar with enforce (1) and (2), but I don't believe they enforce (3), which is unfortunate. And that's what's kicking you in the butt - if you change your "primary key" you have to chase down all the references to that key field and make equivalent alterations if you don't want to break integrity. The solution, as others have said, is to have a true primary key - one that is unique, non-null, and which doesn't change.

There's reasons for all these little rules. This is a great opportunity to understand the "unchanging" part of the primary key rules.

Share and enjoy.

Bob Jarvis 2010-06-24 15:18:04

ansaurus

tags:

views:

answers:

How can I break referential integrity briefly, within a transaction, without disabling the foreign key constraint?

related questions