views:

94

answers:

4

Hi,

My colleague mentioned that our client DBA proposed the removal of all foreign key constraints in our project Oracle DB schema. Initially I did not agree with the decision. I am a developer not a DBA. So later realized that there could be some reasons behind the decision. So I am trying get the pros and cons of this decision.

Proj info:

  1. Spring application with Hibernate persistent.
  2. Oracle 10g DB
  3. There are batch jobs use only SQL-loader or plain JDBC.

Here is my list of pros and cons (Please correct me if I am wrong)

Pros:

  1. Since application persistent is managed by Hibernate, foreign key cascading is not necessary. it is managed by Hibernate with appropriate cascading option.

  2. Hibernate DELETE action(includes delete cascading option) removes the foreign key table records before removing its primary key record (i.e to avoid referential integrity issue). This behavior is same for no-foreign-key case, foreign-key case and foreign-key-with-cascade case. But adding foreign-key will unnecessarily slow down Oracle delete operation.

Cons

  1. Hibernate provides a mechanism for managing association between objects and cascading operations within association. But it never provides complete referential integrity solution that DB has.

  2. Referential integrity is required for those batch jobs use only SQL-loader or plain JDBC.

Guys, I need your advice on this. If anyone of you are a DBA, please provide DBA side reasons.

Thank you.

A: 

Usually, foreign key removal is what database performance optimization starts with. It's kind of trade-off: you sell guaranteed integrity on DBMS level and have to manage it yourself (which is fairly easy with Hibernate but requires to be very accurate in plain SQL), and you get increased query performance since foreign key checks in queries are quite expensive.

Vadim Fedorov
-1 for "you get increased query performance since foreign key checks in queries are quite expensive".
Rob van Wijk
-1 No you don't: foreign key constraints cost NOTHING in queries, only in inserts, updates and deletes.
Tony Andrews
@Tony: I agree, but I've also discovered that for some people (and some DBMS's), all DML statements are called queries, and there are 4 types: select queries, insert queries, update queries and delete queries.
AndyDan
-1 the optimizer is blinded to certain possibilities.
Stephanie Page
+5  A: 

I've worked on databases in projects that decided to drop referential integrity constraints.

We had to write "QC script" to detect orphaned rows with respect to every table relationship (orphaned rows would have been prevented by a foreign key constraint).

Then when (not if) they occured, we had to have policies for how to resolve the orphans. Choices included the following:

  • Delete orphaned rows.
  • Archive orphaned rows.
  • Update any orphaned foreign key values to NULL.
  • Update any orphaned foreign key values to some existing value in the parent table.
  • Live with the anomalies. Write more code to exclude orphans from reports. Maybe a set of VIEWs over all the tables?

You might want to schedule a recurring weekly meeting with the stakeholders of this database to review the QC script report, and decided what to do with each of the orphaned rows.

No framework can enforce referential integrity as reliably as constraints that run in the database. Only the database can provide truly atomic changes and ensure consistency.

Bill Karwin
So what advantage did you gain in exchange for all of that work?
Stephanie Page
@Stephanie Page: The point of my story is that dropping the RI constraints resulted in far too much work. Properly using foreign key constraints prevents many types of anomalies and reduces your work.
Bill Karwin
@Bill, I know your work so I assumed that's what you meant... but it looked almost like a recipe for how to do it rather than an admonition against it. To me and you it looks pretty stupid to do all that work, but some people are so dead-set against RI, you've just given them a project plan. So I was playing the straight man... "Gee Bill, was it worth it?"
Stephanie Page
@Stephanie Page: Aha, that just goes to show that it's hard to convey irony in a text-only medium. Anyway the OP seemed to understand that this anecdote shows why it's *not* worth it.
Bill Karwin
+6  A: 

I have never heard such a proposal from a DBA before! From an application developer, yes, but never from a Database Administrator. It beggars belief.

Tom Kyte has said many times (for example here): applications come and go, but data is forever.

In my own experience, I have worked on Oracle databases that are 20+ years old. They started out in Oracle 6 and got migrated up to 10G or 11g over the years - the same data. But the applications that sat on top? First they were Forms 3.0, then in some cases they got migrated to C++, in some got re-built in Forms 6i, in some rebuilt in Application Express. ADF is another possibility of course; or perhaps a SOA architecture...

What's so special about the current application development tool that it suddenly takes over Oracle's job as the DBMS?

Tony Andrews
+1 Implement FK functionality at the application layer and you'll get to do it all over when the next new and shiny tool/framework comes along.
dpbradley
Every inexperienced developer wants to break the Relational Model because surely the work of billions of person-hours of developing millions of databases over the last 40 years is surely filled with mistakes and errors, tons of inefficiencies and vestigial constructs. My "app", "middle tier", "res file", "data access layer", "poodle" can certainly be more "efficient", "useful", "responsive" at "constraint enforcement", "referential integrity", "indexing", "grouping", etc
Stephanie Page
+4  A: 

Since database constraints are guaranteed they can, in some circumstances, allow additional optimizations.

For example, say you have a view

CREATE VIEW orders_vw AS
SELECT ord.order_id, ord.customer_id, lin.product_id
FROM orders ord JOIN order_lines lin on ord.order_id = lin.order_id

Then you have a query that does a SELECT product_id FROM orders_vw WHERE order_id = :val With the integrity enforced, the database knows that any order_id in order_lines has one row in the parent table and, since no value from the orders table are actually selected, it can save work by not visiting the orders table. Without the constraint, the database can't be sure that an entry in order_lines has a parent, so it has to do the extra work of visiting the orders table to check it.

Depending on your query patterns, you may find removing constraints actually increases the workload on the DB.

Gary
+1 but that's not exactly true. Some databases allow you to create constraints that are not enforced by the DB but can be used by the optimizer. In Oracle they are called "belief" constraints and use the RELY keyword.
Stephanie Page