views:

305

answers:

8

I'm writing a second interface to a database application to get around some shortcomings of the original interface. Unfortunately, if I create new records, expected audit trail records will not be generated. It sure would have been nice if the database design had worked such details into table triggers, or else provided a stored procedure API for such operations as inserting new records into these tables.

Should applications be designed this way, in general? How much of a database application should be in stored procedures?

+1  A: 

IMHO you should create SPs for any logic that is dedicated to handling data and can serve as a good abstraction between the app and the database. A good example of this is in-app hashing of passwords. I always create an SP that creates the record and hashes the password for me....

because the hashed password is really the databases domain. I also create an SP to check if a password string is the same hash (yes this is transmitted to the database, but I trust my database and network).

Forgetting the weak-ish example: when you can separate concerns without damaging the design and integrity of your system... make the abstraction and do it!

The financial sector uses alot of SPs to create an effective API on-top of the database for handling the down-and-dirty financial operations, because they can build-in auditing outside the application in the stored procedures.

In a typical design, what should be RDBMS-side and app-side can be quite obvious when you think ... 'Does the application need to do this with the data, does it care??'

Aiden Bell
+5  A: 

Our teams have a rule - any data going into or out of a database must go through a stored procedure. We have even gone so far as to build restrictions into our data access components to enforce this.

As for other things such as book keeping, audit trails, etc, we put them in stored procedures as well. Triggers are convenient, but we found that for the purposes of audit logs involving the who, what, and when of an update, not all the data we want is redily available to the trigger. The only time we've ever used triggers is to keep a full change history of every record in a table, but even then, in hindsight, it caused maintenance problems and a sproc would have been better for us.

James Conigliaro
+3  A: 

As a general rule, I prefer to keep as much of the data-twiddling in SPs; a semi-generic data-access layer will handle firing off the requests, and presenting the data to the interface.

The old wisdom of "increased performance" with SPs is really no longer relevant*, but I've found maintenance to be much simpler with this method.

However ... I avoid triggers like the plague. They're difficult to maintain, nearly impossible to debug, and lead to all kinds of unintended side effects.


*The common perception is that Stored Procedures perform better than ad-hoc sql queries. However, this is false:

SQL Server 2000 and SQL Server version 7.0 incorporate a number of changes to statement processing that extend many of the performance benefits of stored procedures to all SQL statements. SQL Server 2000 and SQL Server 7.0 do not save a partially compiled plan for stored procedures when they are created. A stored procedure is compiled at execution time, like any other Transact-SQL statement. SQL Server 2000 and SQL Server 7.0 retain execution plans for all SQL statements in the procedure cache, not just stored procedure execution plans.

-- SqlServer's Books Online

Adrien
Can you point to a good article or otherwise clarify why/how the performance benefit of SPs is no longer relevant?
steamer25
Agree totally about triggers, except for audit purposes where they ideal
Cruachan
@Adrien, I modified your question to address steamer25's question. Feel free to roll back if need be.
Gavin Miller
No worries; that's exactly the quote I was planning to put up once I was done doing that "work" thing I'm supposed to be doing between 0800 and 1700 ... :) Although, I am changing your wording slightly, because I think you are addressing the perception that SP's perform better than Ad-hoc, and not vice versa.
Adrien
Oh, and thanks! :)
Adrien
+1  A: 

Different philosophies have different answers, but here is mine

  1. Data intensive calculations -- in particular, anything that is very expensive to do outside of the database.

  2. Shared code. My last job that used a relational database was connecting to a library of database functions. The Java code knew nothing of this connection. To avoid duplication any function needed by both the code talking to the library and used by Java were stored procedures.

The other concern is making certain that difficult or critical code be under test. This does not mean it can't be a stored procedure, just that it may be easier at least at first to write it where testing is easier.

Obviously if the function needs to talk to external services (that the database does not have easy access to), you might not want to make that a stored procedure.

Kathy Van Stone
+1  A: 

Here's the other point of view: None.

A database is for persistent data.

Your application is for processing. All logic, all code, all decision-making, all calculations, all referential integrity, everything should be in your application.

It makes life very simple if you have a simple database that has just the data.

In the long run, many people find that their stored procedures and triggers have created a tangled mess that is hard (or impossible) to maintain. This is easy to prevent by using the database as a fast, large, easy-to-backup repository for data. Indexes and views and a proper part of the database.

Everything else doesn't belong there.

Folks lift up "issues" like the following.

  1. The database is "central". Somehow an application server isn't central. I don't get it. If you want centralized processing, use an application server. If you want a lot of complex, centralized processing, use an ESB.

  2. Some processing is "closer to the database." There are many bad examples. The good examples often has to do with surrogate key assignment. I prefer to use an ORM layer (iBatis, or SQLAlchemy or something) to do this. You get the same effect without writing or maintaining a stored procedure.

  3. Some processing requires auditing. Actually, all processing requires auditing. Use Business Domain objects with an ORM and do your audit processing in your application.

    The RDBMS-based built-in auditing is good. Any auditing you write with stored procedures is not going to be as good. Solution -- use logging in your application, and the built-in audit features of the database engine. Avoid triggers and SP's for this.

  4. Stored procedures are faster. This is untrue. Good transaction design makes things faster. Stored procedures impose a number of constraints on your architecture, which sometimes forces you to create small, tidy transactions. Try writing the same small, focused transaction in Java and you'll find the PL/SQL isn't actually faster.

You can live without stored procedures. In the long run, you'll find that the design discipline of focused transactions has more value than the procedure itself.

Example

"if I create new records, expected audit trail records will not be generated"

If you had a proper Business Domain class definition to create your records, it would also create audit records for you in your application.

S.Lott
I really agree with you, this is exactly my point of view... but I can also understand how some people will see your answer as a sin ^^
marcgg
+1 - I think either point of view is more dogma than logic, but I agree with this dogma
DanSingerman
All good points. As always, the answer is "there is no 'right' answer", other than "use the right tool for the job", "be consistent", and "do what makes the most sense for the given problem".
Adrien
if you ever have to stand up in court and defend your audit trail when it's built into an application layer above the database then you are going to get hung out to dry. To be useful there has to be no possibility of even a chink of light between the data and the audit trail.
Cruachan
@Cruachan: Correct. Web logs, for example, come from the application and stand up to constant scrutiny. Indeed, web log generation -- in the application -- is a better model of audit logging than triggers which "should" happen in the database. But as this question notes, don't always happen correctly.
S.Lott
Not only applications add or change data in your database, to put auditing anywhere except the database is short-sighted and will as Cruachan said, cause you problems if you ever need to defend your process in court.
HLGEM
@HLGEM: Please explain that to apache.org then, because their web server doesn't satisfy your level of auditing. I believe that database triggers are not legally mandated anywhere in the US.
S.Lott
The application logging aspect of web server logs only stand up in court because there are easier ways to falsify them after they are written to disk which will be suspect long before anyone worries about whether they are complete or not due to a webserver malfunction.
Cade Roux
This is your problem -> "It makes life very simple if you have a simple database that has just the data" because the point of application development is *not* to optimise your coding but to serve the needs of the business. If the business is better served by encapsulating logic at the database level then tough, your coders can do it that way and take a running jump if they don't like it. OTOH I actually agree with you, I *prefer* to keep a clean database and do it your way if possible, but my preferences are less important than the customer's needs.
Cruachan
@Cruachan: Your customers may care about where their logic is implemented. Mine don't usually care about "database" vs. "application". What they want is application logic that is right, and adapts as the business changes. I don't find that stored procedures facilitate getting logic right and adaptable. I find that application code is easier to get right and adapt.
S.Lott
+4  A: 

This depends entirely on your environment. The answer to the question really isn't a coding problem, or even an analysis issue, but a business decision.

If your database supports just one application, and is reasonably tightly integrated with it, then it's better, for reasons of flexibility, to place your logic inside your application program. Under these circumstances handling the database simply as a plain data repository using common functionality looses you little and gains flexibility - with vendors, implementation, deployment and much else - and many of the purist arguments that the 'databases are for data' crowd make are demonstratively true.

On the other hand if your are handling a corporate database, which can generally be identified by having multiple access paths too it, then it is highly advisable to screw down the security as far as you can. At the very least all appropriate constraints should enabled, and if possible access to the data should be through views and procedures only. Whining programmers should be ignored in these cases as...

  1. With a corporate database the asset is valuable and invalid data or actions can have business-threatening consequences. Your primary concern is safeguarding the business, not how convenient access is for your coders.
  2. Such databases are by definition accessed by more than one application. You need to use the abstraction that stored procedures offer so the database can be changed when application A is upgraded and you don't have the resource to upgrade application B.
  3. Similarly the encapsulation of business logic in SPs rather than in application code allows changes to such logic to be implemented across the business more easily and reliably than if such logic is embedded in application code. For example if a tax calculation changes it's less work, and more robust, if the calculation has to be changed in one SP than multiple applications. The rule of thumb here is that the business rule should be implemented at the closest point to the data where it is unique - so if you have a specialist application then the logic for that app can be implemented in that app, but logic more widely applicable to the business should be implemented in SPs.

In your case you now have multiple applications accessing the same database so you should be moving logic and audit functions from the application level into the database. Myself I would usually put audit functionality into the database itself anyway because if I am ever then called on to investigate some fraud issue (and it has happened more than once) I simply feel I can stand up and be certain of my findings with considerably more confidence than if it's in the application layer - there's simply less opportunities for caveats that way.

Coders who dive into religious wars over the use or not of SPs generally have worked in only one environment or the other so they extrapolate their limited experience into a cast-iron position - which indeed will be perfectly defensible and correct in the context from which they come but misses the big picture. As always, you should make you decision on the needs of the business/customers/users and not on the which type of coding methodology you prefer.

Cruachan
Good answer and I voted it up, but I'd enjoy hearing more from you (possibly in a second answer) about how for a database with one application putting the logic in the application instead of SPs buys flexibility.
skiphoppy
Generally if I'm coding a stand-alone application (desktop or web) that is only ever going to be accessed through the the one interface then I prefer to keep my database calls as generic as possible and encapsulate my logic inside the application code. This has the advantages that (a) I am not tied to a specific vendor, (b) change control is easier as all code is in the same place, (c) debugging and development are easier for similar reasons, (d) I am less vulnerable to permissions issues...
Cruachan
These are all considerable advantages that do swing it for me towards the 'databases are for data' argument where possible, BUT all of these pale into insignificance when handling business-critical corporate data.
Cruachan
I also come from a generation where grep'ing a million line COBOL program library was routine before applying a business rule change - sure we had libraries but with 20 years of code development you really could never be sure. Chasing OC7 errors with management breathing down your neck really teaches you the benefits of abstraction and keeping the logic as close to the data as can be arranged.
Cruachan
+2  A: 

Everything should be in its appropriate place.

The database has to protect its perimeter. It cannot assume that the only people allowed to connect to it is a trusted and tested application.

Sooner or later someone is going to connect to it with a buggy application or a buggy client library (Oracle has some clients which allow users to create invalid dates). Or they are going to connect with a report writer or Access or Excel.

My preference is not to allow direct table access, but to only allow access to views, SPs etc., so role-based permission is managed there and an interface contract is formed very similarly to the way you define interfaces for your classes or services at the application layer. This does not mean that all/much application logic is in stored procedures, but it does mean that the database enforces certain rules - usually basic ones like referential integrity - so that the application CAN make some assumptions about how the database was used in the past by previous versions of itself safely.

There was a great panel discussion of this at Tech Ed 2009 which you can watch here.

Cade Roux
+3  A: 

Fraud is much easier to commit in a database that has dynamic sql coming from an application because permissions are set at the table level. Most fraud is committed by current employees who often have easy access to the server. Also disgruntled employees like to do things to destroy data as well. Exactly two people should be able to access the tables in your production datbase if any financial data is in it, a dba and one alternate dba. Everyone else should have only the tasks which are described by using the stored procs and nothing else. Auditing must be at the datbase level or it is essentially worthless. If I audit at the application level in an application that does not use stored procs (these two usually run together), it is easy for someone to directly access the tables, change things and then be untraceable. Audit logs are to catch the changes to the data, that can only be done fully at the point where the data changes.

If the data manipulation is complex, stored procs offer the best method for performance tuning as well.

Another reason to use procs is that it is far easier to refactor the database when needed if all the calls to a table are encapsulated in procs. It is also far easier to upload small changes to prod.

HLGEM