views:

392

answers:

7

I have a particularly difficult business constraint that I would like to enforce at the database level. The data is financial in nature, and so must be protected from inconsistencies to the nth degree – no trusting the business layer with this stuff. I use the word "temporal" somewhat loosely, meaning that I intend to control how an entity can and cannot change over time.

Glossing over the details, here's the design:

  • An invoice can feature several fees.
  • Fees are assigned to an invoice shortly after creation of the invoice.
  • The invoice reaches a stage in the process after which it is "locked."
  • From this point on, no fee may be added to or removed from this invoice.

Here's a stripped down data definition:

CREATE TABLE Invoices
(
    InvoiceID INT IDENTITY(1,1) PRIMARY KEY,
)

CREATE TABLE Fees
(
    FeeID INT IDENTITY(1,1) PRIMARY KEY,
    InvoiceID INT REFERENCES Invoices(InvoiceID),
    Amount MONEY
)

You'll notice that the "lockable" nature of the invoice isn't represented here; how to represent it – and whether it needs to be directly represented at all – is still an open question.

I have come to believe that this is one of those arrangements that cannot be translated into domain-key normal form, though I may be wrong. (There really is no way to tell, after all.) That said, I still hold out hope for a highly-normalized solution.

I happen to be implementing this on SQL Server 2008 (the syntax may have been a hint), but I'm a curious guy, so if there are solutions that work on other DBMS's, I'd love to hear about those as well.

+8  A: 

I cannot think of a way to do this with normalization. However, if I wanted to constrain this on the database, I'd do it 1 of two ways:

first, I would add a 'locked' column to Invoices, which is a bit of some sort, just a way to key it as locked.

Then, the two ways:

  1. A "before insert" trigger, which would throw an error before an insert is made, if the invoice referred to is locked.
  2. Do this logic in the stored procedure that creates the fee.

EDIT: I couldn't find a good MSDN article on how to do one of these, but IBM has one that works pretty well in SQL Server: http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=/sqlp/rbafybeforesql.htm

Erich
I've thought about this solution, and it does appear to be the most-transparent implementation. That said, it requires at least five triggers: the one you suggested, two on Invoices to prevent unlocking and deleting. and two more on Fees for UPDATE and DELETE. I don't consider that a problem, though.
WCWedin
Damn, I accidentally cancelled my up-vote. Apparently I was too slow in undoing it. Sorry.
WCWedin
This is the correct response. Hopefully my more wordy reply adds some value to the discussion.
Paul Keister
A: 

You can't just by using FK constraints and the like -- at least not in any way that makes much sense. I would suggest using an INSTEAD OF trigger in SQL Server to enforce this constraint. It should be fairly easy to write and pretty straightforward.

Dave Markle
+1  A: 

You can constrain additions to the FEES table by altering your data model to use:

INVOICES

  • INVOICE_ID
  • INVOICE_LOCKED_DATE, null

FEES

  • FEE_ID (pk)
  • INVOICE_ID (pk, fk INVOICES.INVOICE_ID)
  • INVOICE_LOCKED_DATE (pk, fk INVOICES.INVOICE_LOCKED_DATE)
  • AMOUNT

At a glance, it's redundant but as long as an INSERT statement for the FEES table doesn't include a lookup to the INVOICES table for the locked date (default being null) - it ensures that new records have the date the invoice was locked.

Another option is to have two tables regarding fee handling - PRELIMINARY_FEES and CONFIRMED_FEES.

While an invoices fees are still editable, they reside in the PRELIMINIARY_FEES table and once confirmed - are moved to the CONFIRMED_FEES. I don't like this one much for sake of having to maintain two identical tables along with query implications, but it would allow for using GRANTs (on a role, not user, basis) to only allow SELECT access to CONFIRMED_FEES while allowing INSERT, UPDATE, DELETE on the PRELIMINARY_FEES table. You can't restrict grants in a single FEES table setup because a grant isn't data aware - you can't check for a given status.

OMG Ponies
Unless I missed something, simply having a date in the foreign key doesn't prevent adding new fees to an old invoice using the old date. It would throw up a red flag in application code if the date is set explicitly where it should not be, though subtle bugs could still crop up. Your separate tables idea does suggest that perhaps separate views (one updatable, one not) along with appropriate permissions on the source tables could be used to enforce the constraint.
WCWedin
@WDWedin: Yes, that is what I explained regarding INSERTS in that setup. Just how much access will your users have to the database?
OMG Ponies
Tables would be preferred to views. If the data still resides in the same table, then it will still be ultimately editable.
OMG Ponies
Regarding tables vs. views and permissions: Would a user with rights to update a view be barred from doing so if the trigger updates a table they do not have permission to update? My first thought is that the update would go through, but the more I think about it, the more likely it seems that the trigger executes with the same permissions as the user.
WCWedin
Regarding the LockedDate approach, I get the general thrust of the idea. To answer your question, access to the database is theoretically limited to a single application that exposes an API over the web. I'm more afraid of errant application code or clumsy DBA's causing massive damage to finacial data, rather than malicious writes originating from the wild. I could implement your suggestion with cascading constraints and private members in the ORM layer, and that would solve half the problem. I would still like to save future DBA from themselves if I can, though.
WCWedin
It's my understanding that data presented in SQL Server views can not be updated. Some databases allow it though.
OMG Ponies
OMG Ponies
Actually, you can create INSTEAD OF triggers on views as of SQL Server 2000, and there's some finicky method using indexed views in newer versions. I am worried about what DBA's can do; my company has a habit of allowing too many chefs in the database, so to speak. It's pretty frustrating that sysadmins and db_owners are so god-like; only triggers can stand in their way, it seems.
WCWedin
Nothing is perfect - triggers can be disabled, same as constraints. People are human too.
OMG Ponies
True, but DBA's have to take a conscious step to disable a trigger. The flip side is that they also have to take a conscious step to re-enable it, of course. Anyway, thanks for your help and advice!
WCWedin
+1  A: 

I think you will be best off explicitly storing the 'locked/unlocked' state for the invoice in the invoice table, and then apply triggers on INSERT and DELETE (and UPDATE, though you don't actually say you want the fees on the invoice frozen) to prevent modifications if the invoice is in the locked state.

The locked flag is necessary unless there is a reliable algorithmic method to determine when the invoice is locked - perhaps 2 hours after it is produced. Of course, you have to update the invoice row to lock it - so an algorithmic method is better (fewer updates).

Jonathan Leffler
I appreciate the elegance of the algorithmic method, too, but reality keeps that from working for us; invoice-locking is inherently a user-initiated process. You're right that I didn't say that fees need to lock, as well. It seems obvious to me after working on this problem for several days, so I completely forgot to mention it. The idea is that an invoice shouldn't change after it's been sent to the customer – how would they feel if they sent a check, only to find out that the amount due has since changed? Anyway, thanks for your help!
WCWedin
+1  A: 

Why not just have a 'Locked' column that's a boolean (or single char, 'y', 'n' for example) and tweak your update query to use a subquery :

INSERT INTO Fees (InvoiceID, Amount) VALUES ((SELECT InvoiceID FROM Invoices WHERE InvoiceID = 3 AND NOT Locked), 13.37);

Assuming you have a not-null constraint on the InvoiceID column, the insertion will fail when the invoice is locked. You can handle the exception in your code and thus prevent fee additions when the invoice is locked. You'll also avoid having to write and maintain complicated triggers and stored procedures as well.

PS. The insertion query above uses MySQL syntax, I'm afraid I'm not that familiar with SQL Server's TQL variant.

Alex Marshall
I like the elegance of the approach, but it does place the responsibility of data integrity firmly on the application. I could embed the logic in the ORM, but I'm afraid that's likely the be more error-prone and difficult to maintain than a trigger-based approach. That said, this particular query could be of use inside a trigger, if it proves more efficient than an EXISTS clause.
WCWedin
+4  A: 

Don't complicate it, I'd go with triggers. There is no shame in using them, this is what they are there for.

To avoid lots of logic in the triggers, I add an "Editable" bit column into the header table, then basically use a divide with Editable to either work or cause a divide by zero error, which I CATCH and convert to a Invoice is not editable, no changes permitted message. There are no EXISTS used to eliminate extra overhead. Try this:

CREATE TABLE testInvoices
(
     InvoiceID   INT      not null  IDENTITY(1,1) PRIMARY KEY
    ,Editable    bit      not null  default (1)  --1=can edit, 0=can not edit
    ,yourData    char(2)  not null  default ('xx')
)
go

CREATE TABLE TestFees
(
    FeeID     INT IDENTITY(1,1) PRIMARY KEY
   ,InvoiceID INT REFERENCES testInvoices(InvoiceID)
   ,Amount    MONEY
)
go

CREATE TRIGGER trigger_testInvoices_instead_update
ON testInvoices
INSTEAD OF UPDATE
AS
BEGIN TRY
    --cause failure on updates when the invoice is not editable
    UPDATE t 
        SET Editable =i.Editable
           ,yourData =i.yourData
        FROM testInvoices            t
            INNER JOIN INSERTED      i ON t.InvoiceID=i.InvoiceID
        WHERE 1=CONVERT(int,t.Editable)/t.Editable    --div by zero when not editable
END TRY
BEGIN CATCH

    IF ERROR_NUMBER()=8134 --catch div by zero error
        RAISERROR('Invoice is not editable, no changes permitted',16,1)
    ELSE
    BEGIN
        DECLARE @ErrorMessage nvarchar(400), @ErrorNumber int, @ErrorSeverity int, @ErrorState int, @ErrorLine int
        SELECT @ErrorMessage = N'Error %d, Line %d, Message: '+ERROR_MESSAGE(),@ErrorNumber = ERROR_NUMBER(),@ErrorSeverity = ERROR_SEVERITY(),@ErrorState = ERROR_STATE(),@ErrorLine = ERROR_LINE()
        RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState, @ErrorNumber,@ErrorLine)
    END

END CATCH
GO


CREATE TRIGGER trigger_testInvoices_instead_delete
ON testInvoices
INSTEAD OF DELETE
AS
BEGIN TRY
    --cause failure on deletes when the invoice is not editable
    DELETE t
    FROM testInvoices            t
        INNER JOIN DELETED       d ON t.InvoiceID=d.InvoiceID
        WHERE 1=CONVERT(int,t.Editable)/t.Editable    --div by zero when not editable
END TRY
BEGIN CATCH

    IF ERROR_NUMBER()=8134 --catch div by zero error
        RAISERROR('Invoice is not editable, no changes permitted',16,1)
    ELSE
    BEGIN
        DECLARE @ErrorMessage nvarchar(400), @ErrorNumber int, @ErrorSeverity int, @ErrorState int, @ErrorLine int
        SELECT @ErrorMessage = N'Error %d, Line %d, Message: '+ERROR_MESSAGE(),@ErrorNumber = ERROR_NUMBER(),@ErrorSeverity = ERROR_SEVERITY(),@ErrorState = ERROR_STATE(),@ErrorLine = ERROR_LINE()
        RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState, @ErrorNumber,@ErrorLine)
    END

END CATCH
GO

CREATE TRIGGER trigger_TestFees_instead_insert
ON TestFees
INSTEAD OF INSERT
AS
BEGIN TRY
    --cause failure on inserts when the invoice is not editable
    INSERT INTO TestFees
            (InvoiceID,Amount)
        SELECT
            f.InvoiceID,f.Amount/i.Editable  --div by zero when invoice is not editable
            FROM INSERTED                f
                INNER JOIN testInvoices  i ON f.InvoiceID=i.invoiceID
END TRY
BEGIN CATCH

    IF ERROR_NUMBER()=8134 --catch div by zero error
        RAISERROR('Invoice is not editable, no changes permitted',16,1)
    ELSE
    BEGIN
        DECLARE @ErrorMessage nvarchar(400), @ErrorNumber int, @ErrorSeverity int, @ErrorState int, @ErrorLine int
        SELECT @ErrorMessage = N'Error %d, Line %d, Message: '+ERROR_MESSAGE(),@ErrorNumber = ERROR_NUMBER(),@ErrorSeverity = ERROR_SEVERITY(),@ErrorState = ERROR_STATE(),@ErrorLine = ERROR_LINE()
        RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState, @ErrorNumber,@ErrorLine)
    END

END CATCH
GO

CREATE TRIGGER trigger_TestFees_instead_update
ON TestFees
INSTEAD OF UPDATE
AS
BEGIN TRY
    --cause failure on updates when the invoice is not editable
    UPDATE f 
        SET InvoiceID =ff.InvoiceID
           ,Amount    =ff.Amount/i.Editable --div by zero when invoice is not editable
        FROM TestFees                f
            INNER JOIN INSERTED     ff ON f.FeeID=ff.FeeID
            INNER JOIN testInvoices  i ON f.InvoiceID=i.invoiceID
END TRY
BEGIN CATCH

    IF ERROR_NUMBER()=8134 --catch div by zero error
        RAISERROR('Invoice is not editable, no changes permitted',16,1)
    ELSE
    BEGIN
        DECLARE @ErrorMessage nvarchar(400), @ErrorNumber int, @ErrorSeverity int, @ErrorState int, @ErrorLine int
        SELECT @ErrorMessage = N'Error %d, Line %d, Message: '+ERROR_MESSAGE(),@ErrorNumber = ERROR_NUMBER(),@ErrorSeverity = ERROR_SEVERITY(),@ErrorState = ERROR_STATE(),@ErrorLine = ERROR_LINE()
        RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState, @ErrorNumber,@ErrorLine)
    END

END CATCH
GO

CREATE TRIGGER trigger_TestFees_instead_delete
ON TestFees
INSTEAD OF DELETE
AS
BEGIN TRY
    --cause failure on deletes when the invoice is not editable
    DELETE f
    FROM TestFees                f
        INNER JOIN DELETED      ff ON f.FeeID=ff.FeeID
        INNER JOIN testInvoices  i ON f.InvoiceID=i.invoiceID AND 1=CONVERT(int,i.Editable)/i.Editable --div by zero when invoice is not editable
END TRY
BEGIN CATCH

    IF ERROR_NUMBER()=8134 --catch div by zero error
        RAISERROR('Invoice is not editable, no changes permitted',16,1)
    ELSE
    BEGIN
        DECLARE @ErrorMessage nvarchar(400), @ErrorNumber int, @ErrorSeverity int, @ErrorState int, @ErrorLine int
        SELECT @ErrorMessage = N'Error %d, Line %d, Message: '+ERROR_MESSAGE(),@ErrorNumber = ERROR_NUMBER(),@ErrorSeverity = ERROR_SEVERITY(),@ErrorState = ERROR_STATE(),@ErrorLine = ERROR_LINE()
        RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState, @ErrorNumber,@ErrorLine)
    END

END CATCH
GO

here is a simple test script to test out the different combinations:

INSERT INTO testInvoices VALUES(default,default) --works
INSERT INTO testInvoices VALUES(default,default) --works
INSERT INTO testInvoices VALUES(default,default) --works

INSERT INTO TestFees (InvoiceID,Amount) VALUES (1,111)  --works
INSERT INTO TestFees (InvoiceID,Amount) VALUES (1,1111) --works
INSERT INTO TestFees (InvoiceID,Amount) VALUES (2,22)   --works
INSERT INTO TestFees (InvoiceID,Amount) VALUES (2,222)  --works
INSERT INTO TestFees (InvoiceID,Amount) VALUES (2,2222) --works

update testInvoices set Editable=0 where invoiceid=3 --works
INSERT INTO TestFees (InvoiceID,Amount) VALUES (3,333) --error<<<<<<<

UPDATE TestFees SET Amount=1 where feeID=1 --works
UPDATE testInvoices set Editable=0 where invoiceid=1 --works
UPDATE TestFees SET Amount=11111 where feeID=1 --error<<<<<<<
UPDATE testInvoices set Editable=1 where invoiceid=1 --error<<<<<<<

UPDATE testInvoices set Editable=0 where invoiceid=2 --works
DELETE TestFees WHERE invoiceid=2 --error<<<<<

DELETE FROM testInvoices where invoiceid=2 --error<<<<<

UPDATE testInvoices SET Editable='A' where invoiceid=1 --error<<<<<<< Msg 245, Level 16, State 1, Line 1 Conversion failed when converting the varchar value 'A' to data type bit.
KM
Thanks for putting it all together; the test data was especially nice. The only thing I would oppose is the cryptic error this would throw back to the programmer (or worse, the end user). Getting a divide-by-zero error form the database could easily send an unaware junior programmer sniffing in all the wrong places for hours. It's less efficient, but I would probably just go with an explicit EXISTS check and something like RAISERROR 50003 'Cannot update a complete invoice.' Nice work, though.
WCWedin
@WCWedin, I've changed the triggers to issue the error _Invoice is not editable, no changes permitted_ when appropriate. As you mention, I think my method adds less overhead than using _EXISTS_. You really want to add as little overhead in triggers as possible. Also note, these trigers are _INSTEAD OF_, so there are fewer hits to the table than an _AFTER_ and no _EXISTS_ checks either.
KM
Nicely done. I like the error-fu you pulled there. Kind of tangential, but what's the scope on the ERROR_X() functions? It would be kind of cool to have a "Rethrow" sproc. It does, unfortunately, throw away the message id when it re-raises the original error, which could complicate error-handling in the application.
WCWedin
the ERROR_..() functions only work in a BEGIN-END CATCH block. I have edited the code making the triggers throw the complete proper error when not an "editable" error. I added a new test case to test these new complete errors, and it shows does rethrow the complete message: _Msg 245, Level 16, State 1, Line 1 Conversion failed when converting the varchar value 'A' to data type bit._
KM
It turns out you can push the rethrow off to a sproc: http://msdn.microsoft.com/en-us/library/ms179296.aspx. I would argue that if errors are rare (they really should be caught before deployment), it's better to take an occasional performance hit than to permanently pollute the application code with really complex error handling. I think I'll run some performance tests on the TRY/CATCH and EXISTS methods. After all, if the straight-forward approach turns out to be too slow as the number of rows increases, it's pretty simple to throw an ALTER TRIGGER statement into a new migration. Thanks!
WCWedin
not all errors are bugs to catch in development. multiple people working on overlapping data can cause constraint errors, that were designed to happen and are surely not bugs. There is never enough time to go back and speed up every piece of slow code, so just do it right the first time: do it as fast as possible and with as as little overhead as possible (forget the EXISTS).
KM
Good point about the constraint errors.I disagree with your notion of "doing it right," though. I think as fast as possible is almost never the right way to do the first time around -- some people would see that as premature optimization. You can never guess with execution paths are truly performance-critical beforehand, and if the optimization ends up spewing all over other parts of the program, that's a problem. As written, this approach will cover up real divide-by-zero errors, require code to parse error messages in special ways in special cases, and possibly truncate the error message.
WCWedin
Continuted: These are really just ideological and academic points, though. Not that I'm not enjoying the discussion – I just want to make it clear that I don't mean anything personal by it! I appreciate your help and all the effort you've put into you answer.
WCWedin
@WCWedin said _I think as fast as possible is almost never the right way to do the first time around_ you must refactor a lot, or don't truly mean it. Do you wait until you the query is slow to add an index or do you know it will need it and add it when you create the table? If you eliminate the need to do an extra sql commands (insert or update or delete + EXISTS check vs just insert or update or delete) is hardly premature optimization, it is just smart.
KM
I _do_ refactor a lot, as it happens. But really, the issue is that there's a cost difference. Unexpected behavior in a trigger will have future developers cursing your name, and it costs a lot to write code that can handle it. My feeling is that triggers should be as transparent and intuitive as possible, or someone will pay the price in frustration later. In contrast, indexes are nearly free to create, and they rarely have hidden costs. (And when they do have hidden costs, it's usually painless to drop them.) "Premature optimization" is a relative term.
WCWedin
would you rather have an "intuitive" cursor loop where each row is processed, and every developer understands what is going on or a complex single query command to process the set. I always go for the single step. you code to the language, developers should increase their skills, don't code for newbies.
KM
Of course not. That's a complete strawman. I'm not talking about readability versus efficiency (though that is an important consideration). What I'm saying is that if two different methods produce difference results, you pick the one that produces the correct results. From your pattern of participation on Stack Overflow, I gather you're more of a DBA than an application programmer. I have to fill both niches where I work, and I've learned that it's important to strike a balance; if you want the programmers to be kind to your database, it should be kind to them in turn.
WCWedin
Alternatively stated, if it's fast _enough_, leave it. If it's not, change it.
WCWedin
+1  A: 

I agree with the general consensus that a lock bit should be added to the Invoices table to indicate whether fees may be added. It is then necessary to add TSQL code to enforce the business rules related to locked invoices. Your original post doesn't seem to include specifics about the conditions under which an invoice becomes locked, but it is reasonable to assume that the lock bit can be set appropriately (this aspect of the problem could become complicated, but let's settle that in another thread).

Given this consensus, there are 2 implementation choices that will effectively enforce the business rule in the data tier: triggers and standard stored procedures. To use standard stored procedures, one would of course DENY UPDATES, DELETES, AND INSERTS for the Invoices and Fees tables and require that all data modification be done using stored procedures.

The advantage of using triggers is that application client code can be simplified because the tables can be accessed directly. This could be an important advantage if you are using LINQ to SQL, for example.

I can see a couple of advantages to using stored procedures. For one thing, I think using a stored procedure layer is more straightforward and hence more understandable to maintenance programmers. They, or you several years from now, may not remember that clever trigger you created, but a stored procedure layer is unmistakable. On a related point, I would argue that there is a danger of accidentally dropping a trigger; it is less likely that someone would accidentally change the permissions on these tables to make them directly writable. While either scenario is possible, if there's a lot riding on this I would opt for the stored procedure option for safety's sake.

It should be noted that this discussion is not database agnostic: we're discussing SQL Server implementation options. We could use a similar approach with Oracle or any other server that provides procedural support for SQL, but this business rule can't be enforced using static constraints, nor can it be enforced in a database neutral manner.

Paul Keister
That's a fair point about the DB agnostic tag. At the time, I was hoping that the right transformation could flatten the temporal dimension of the business rule, making it amenable to static constraints. Regarding triggers versus stored procedures, we are using LINQ in our data layer, so triggers are the natural choice. Though it's also pretty easy to get stored procedures working in an intuitive way with LINQ, as well.
WCWedin