views:

65

answers:

4

Imagine you have the following table (note: this is a contrived/simplified example):

CREATE TABLE foo (    
  book_id number,
  page number,
  -- [a bunch of other columns describing a single page in a book]
);

ALTER TABLE foo
ADD (CONSTRAINT foo_pk PRIMARY KEY(book_id, page));

While (book_id, page) pairs are unique, the same page number will be repeated between books (many books will have a page 1). Therefore, if a SQL query doesn't specify a book_id, the wrong page(s) may be selected/updated/deleted. All of our queries should act on just one book at a time, but I've seen a couple bugs where the book_id parameter was accidentally omitted.

Is there a programmatic way to enforce that every select, insert, update, etc query specifies a book_id in the where clause?

We generate the SQL code for the queries dynamically and execute them using Spring's JdbcTemplate. The database is Oracle. Using automated tests to check that the many possible queries (plus new ones that get added in the future!) don't get tripped up by duplicate page_ids is tricky. I could override the JdbcTemplate code to ensure the sql queries always include a book_id parameter, but that involves manually parsing SQL code (especially tricky with subqueries) and seems hacky. Is there a more robust solution to enforce this? Some trigger, stored procedure, constraint?

+2  A: 

You could use a function or stored procedure instead of using UPDATE directly. The procedure takes 2 parameters and throws an error if either is null.

The other option is to make sure that the queries you generate always have the book_id constraint. I hope you're not creating the whole SQL statement as a String and that you're using parameterized queries. If you're not, then using parameterized queries is a good way to make sure you pass a book_id always (if you leave a parameter unset, the query won't run). Plus, you're not at risk if you don't sanitize your input when you use parameterized queries.

Chochos
Using stored procedures instead of queries would require changing an awful lot of code, but I suppose I could do it if no other solution is possible.
Yevgeniy Brikman
Oh, and we are using parameterized queries, the problem is that there are quite a few of them, new ones added all the time, and occasionally someone forgets the book_id parameter in the where clause. In other words, I'm looking for a solution that helps in case the programmer didn't write the query properly.
Yevgeniy Brikman
+3  A: 

A common way to protect your database against programmer error is to require that applications use stored procedures. (Sometimes this can be done using permissions.)

It's much easier to inspect your procs for compliance than ad hoc queries.

egrunin
A: 

The only way I can think of doing it is to replace the book_id and page columns in the table with a single column that store both pieces of information — something like (book_id*10000 + page) if you want in integer column or "book_id-page" for a string column.

This is a bad idea from a correctness point of view (two attributes stored in one column), but would force your programmers to use both attributes to interact with the table. If that's a big enough concern to you, you might consider it.

Larry Lustig
+1  A: 

First of all, this is really a testing issue - it is not users who will make the mistake, it is developers, and their mistakes should be caught before the application goes live.

Having said that, you could trap such updates via a combination of triggers:

  • A statement-level BEFORE trigger to initialise a package variable g_book_id to null
  • A row-level trigger to (a) check that the book_id being updated matches that in the package variable (if not null), and (b) intitalises the package variable if it is null.

A simple example:

SQL> create table t1 (id int, col2 int);

Table created.

SQL> insert into t1 values(1, null);

1 row created.

SQL> insert into t1 values(2, null);

1 row created.

SQL> create package p1 is g_id integer; end;
  2  /

Package created.

SQL> create trigger t1_bus
  2  before update on t1
  3  begin
  4    p1.g_id := null;
  5* end;
SQL> /

Trigger created.

SQL> create trigger t1_bir
  2  before update on t1
  3  for each row
  4  begin
  5     if :new.id != p1.g_id then
  6       raise_application_error(-20000,'You can only update 1 ID at a time');
  7     end if;
  8     p1.g_id := :new.id;
  9  end;
 10  /

Trigger created.

SQL> update t1 set col2=1 where id=1;

1 row updated.

SQL> update t1 set col2=2 where id=2;

1 row updated.

SQL> update t1 set col2=3; -- ID not specified
update t1 set col2=3
       *
ERROR at line 1:
ORA-20000: You can only update 1 ID at a time
Tony Andrews
He wants this for `SELECT`, as well...
egrunin
True, this only addresses updates and deletes, another approach would be required for selects. But I can't see how it could ever make sense to restrict selects to a single book at a time - how would the user evere see a list of books to choose one from? Or count how many books there were on a certain subject? Or ... well, you get the idea.
Tony Andrews
Interesting approach, thanks. I can't imagine there is a trigger based approach that would work for `SELECT` too? As I said in a comment above, the actual domain doesn't deal with books (I modified table/column names for ease of discussion) and in the real domain, queries across multiple "books" just don't make sense.
Yevgeniy Brikman
Is each user always using the same bookid? If so maybe VPD is the solution?
Tony Andrews