A general question, without a specific case in mind - is it usually preferred to use MySQL stored procedures over writing a PHP script that performs the same calculations and queries?
What are the benefits of each method?
Wherever possible, the end user will benefit from the abstraction of the data from the UI. Therefore, you should try to leverage stored procedures as much as possible.
Stored procedures, 99 times out of 100. If I were to pick one reason, it would be this: if your PHP web app does all database access via stored procedures, and the database user only has permission to execute said stored procedures, then you are very well protected against SQL injection attacks (as long as the calls bind their parameters and the procedures don't build dynamic SQL from their inputs).
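A minimal sketch of what that setup might look like from the PHP side, assuming PDO with the MySQL driver. The procedure name get_orders_for_customer, the schema shop and the account webapp are all hypothetical, chosen only for illustration.

```php
<?php
// Assumed one-off setup, run by an administrator (all names are hypothetical):
//   GRANT EXECUTE ON PROCEDURE shop.get_orders_for_customer TO 'webapp'@'localhost';
// The webapp account receives no direct privileges on the underlying tables.

/**
 * Calls the hypothetical procedure get_orders_for_customer and returns its rows.
 */
function fetchOrdersForCustomer(PDO $pdo, int $customerId): array
{
    // The argument travels as a bound parameter, never as part of the SQL text.
    $stmt = $pdo->prepare('CALL get_orders_for_customer(:customer_id)');
    $stmt->bindValue(':customer_id', $customerId, PDO::PARAM_INT);
    $stmt->execute();

    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    // Free the statement before the same connection runs its next query.
    $stmt->closeCursor();

    return $rows;
}
```

If the CALL string were built by concatenating user input instead, the injection risk would be back, stored procedure or not.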
If you don't actually need the underlying values in PHP, let the database perform the calculations. This keeps the volume of data transferred between the database and the PHP script to a minimum; in general, calculations on database data are best performed by the database itself.
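As a rough illustration of the point about data transfer, here is a sketch that asks the database for a single total instead of fetching every row and summing in PHP. The table and column names (order_items, quantity, unit_price, order_id) are hypothetical.

```php
<?php
/**
 * Returns the total for one order, computed by the database.
 * Table and column names are hypothetical.
 */
function orderTotal(PDO $pdo, int $orderId): float
{
    // One number crosses the wire instead of every line item.
    $stmt = $pdo->prepare(
        'SELECT COALESCE(SUM(quantity * unit_price), 0)
           FROM order_items
          WHERE order_id = :order_id'
    );
    $stmt->execute([':order_id' => $orderId]);

    return (float) $stmt->fetchColumn();
}
```

The same query could live inside a stored procedure; the saving comes from aggregating on the database side, whichever way it is called.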
I've heard people say "let the DB do as much as it can", and others cry "wtf, what are you doing to my DB performance?"
So I guess it should mostly be a decision based on where the load is: stored procedures will stress the MySQL process, and PHP code will stress the web server process.
For me, the advantage of keeping anything to do with the database within the database is debugging. If you have your calculations (at least most of them) done within the stored procedure, and you need to make a change, then you just modify it, test it, save it. There would be no changes to your PHP code.
If you're storing major calculations within your PHP code, you need to take the SQL statements out of the code, clean them up, modify them, test them, and then copy them back in and test again.
Ease of maintenance comes to mind with keeping things separate. The code will look cleaner and be easier to read if you use stored procedures, because we all know that some SQL scripts just get to be ridiculously large. Keep all that database logic in the database.
If the database is properly tuned, you'll probably see slightly quicker execution. Rather than having PHP build the query string and send it to the database to be parsed and executed, you just push parameters to the stored procedure, which the database has typically already parsed and can reuse, so things will be slightly quicker. A few carefully placed indexes can help speed up any data retrieval because really, the web server is just a conduit, and PHP scripts don't load it up that much.
I would say "don't make too much magic with the database". The worst case would be for a new developer on the project to notice that **an operation** is done, but not be able to see where in the code it happens. So he keeps looking for it, but it's done in the database.
So if you do some "invisible" database operations (I'm thinking about triggers), document them in the code.
// add a new user
$user = new User("john", "doe");
$user->save();
// The id is computed by the database, see MYPROC_ID_COMPUTATION
print $user->getId();
On the other hand, writing functions for the DB is a good idea, and would provide the developer with a good abstraction layer.
// Computes an ID for the given user
$stmt = $db->prepare("SELECT COMPUTE_ID(?) FROM DUAL"); // bind the login instead of concatenating it
$stmt->execute(array($user->getLogin()));
Of course this is all pseudo-code, but I hope you understand my obscure idea.
I think Jeff Atwood hit the nail on the head in 2004 regarding stored procs:
Who Needs Stored Procedures, Anyways?
Having used both stored procedures and dynamic SQL extensively I definitely prefer the latter: easier to manage, better encapsulation, no BL in the data access layer, greater flexibility and much more. Virtually every major open-source PHP project uses dynamic SQL over stored procs (see: Drupal, Wordpress, Magento and many more).
This conversation almost seems archaic: get yourself a good ORM, stop fretting over your data access and start building awesome applications.
Well, there's a side of this argument that I very rarely hear, so I'll write it here...
Code is version controlled. Databases are not. So if you have more than one instance of your code, you'll need some way of performing migrations automagically upon update or you'll risk breaking things. And even with that, you still face the problem of "forgetting" to add an updated SP to the migration script, and then breaking a build (potentially without even realizing it if you aren't testing REALLY in depth).
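For what it's worth, the "migrations upon update" idea can be sketched in a few lines. This is only an illustration, not a replacement for a real migration tool. It assumes each .sql file in the repository holds exactly one statement (for example one CREATE PROCEDURE) and that a table schema_migrations(name) already exists; both the table and the file layout are hypothetical.

```php
<?php
/**
 * Returns the migration files that have not been applied yet, in name order.
 *
 * @param string[] $files   file names found in the repository, e.g. "002_add_proc.sql"
 * @param string[] $applied file names already recorded in the database
 * @return string[]
 */
function pendingMigrations(array $files, array $applied): array
{
    $pending = array_values(array_diff($files, $applied));
    sort($pending, SORT_STRING);

    return $pending;
}

/**
 * Applies every pending file and records its name.
 * Assumes one statement per file and an existing schema_migrations(name) table.
 */
function migrate(PDO $pdo, string $dir): void
{
    $files   = array_map('basename', glob($dir . '/*.sql') ?: []);
    $applied = $pdo->query('SELECT name FROM schema_migrations')
                   ->fetchAll(PDO::FETCH_COLUMN);

    foreach (pendingMigrations($files, $applied) as $name) {
        // Run the statement stored in the file, then remember that it ran.
        $pdo->exec(file_get_contents($dir . '/' . $name));
        $pdo->prepare('INSERT INTO schema_migrations (name) VALUES (?)')
            ->execute([$name]);
    }
}
```

This does nothing about the "forgot to add the SP to a migration" problem; it only makes sure that whatever is in the repository gets applied once, in order.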
From a debugging and maintenance standpoint, I find SP's 100x as hard to dissect as raw SQL. The reason is that it requires at least three steps. First, look in the PHP code to see which procedure is called. Then go into the database and find that procedure. Then finally look at the procedure's code.
Another argument (along the lines of version control) is that there's no svn st command for the SP's. So if you get a developer who manually modifies a SP, you're going to have a hell of a time figuring that out (assuming they are not all managed by a single DBA).
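One partial workaround, offered here as a sketch rather than something this answer relies on: MySQL exposes routine bodies through the information_schema.ROUTINES view, so a script can hash them and compare against hashes recorded at deploy time. The function names below are hypothetical, and keying by name alone assumes a schema has no procedure and function sharing a name.

```php
<?php
/**
 * Returns [routine name => SHA-256 of its body] for one schema,
 * read from MySQL's information_schema.ROUTINES view.
 */
function routineFingerprints(PDO $pdo, string $schema): array
{
    $stmt = $pdo->prepare(
        'SELECT ROUTINE_NAME, ROUTINE_DEFINITION
           FROM information_schema.ROUTINES
          WHERE ROUTINE_SCHEMA = ?'
    );
    $stmt->execute([$schema]);

    $hashes = [];
    foreach ($stmt->fetchAll(PDO::FETCH_NUM) as [$name, $body]) {
        // The body can be NULL when the account lacks privileges on the routine.
        $hashes[$name] = hash('sha256', (string) $body);
    }

    return $hashes;
}

/**
 * Names that were changed, removed or added compared with the expected hashes.
 */
function driftedRoutines(array $expected, array $live): array
{
    $names   = array_unique(array_merge(array_keys($expected), array_keys($live)));
    $drifted = array_filter(
        $names,
        fn ($n) => ($expected[$n] ?? null) !== ($live[$n] ?? null)
    );
    sort($drifted);

    return $drifted;
}
```

A deploy script could store the output of routineFingerprints() and a nightly job could call driftedRoutines() against the live database to flag manual edits.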
Where SP's really shine is when you have multiple applications talking to the same database schema. Then, you only have one place where DDL and DML is stored, and both applications can share it without having to add a cross dependency in one or more libraries.
So, in short, my view is as follows:
Use Stored Procedures:
Use raw SQL/ORM/Generated SQL just about in any other case (Just about, since there are bound to be edge cases I am not thinking about)...
Again, that's just my $0.02...
For us, using stored procedures is absolutely critical. We have a fairly large .NET app. Redeploying the entire app can take our users offline for a brief period, which simply is not allowed.
However, while the application is running we sometimes have to make minor corrections to our queries. Simple things like adding or removing a NOLOCK, or maybe even changing the joins involved. It's almost always for performance reasons. Just today we had a bug caused by an extraneous NOLOCK. 2 minutes to locate the problem, determine the solution, and deploy the new proc: zero downtime. To do so with queries in code would have caused at least a minor outage, potentially pissing off a lot of people.
Another reason is security. With proc's we pass the user id (non-sequential, non-guessable) into each proc call. We validate that the user has access to run that function in the web app, and again inside the database itself. This radically raises the barrier for hackers if our web app were compromised. Not only couldn't they run any SQL they want, but even to run a proc they would need a particular authorization key, which would be difficult to acquire. (And no, that's not our only defense.)
We have our proc's under source control, so that isn't an issue. Also, I don't have to worry about how I name things (certain ORM's hate certain naming schemes) and I don't have to worry about in-flight performance. You have to know more than just SQL to properly tune an ORM: you have to know the ORM's particular behaviors.