views: 52
answers: 2
At work someone said "when we design and optimize the stored procedures, we have to take into account the fact that they will be running on large database servers".

I'm confused by that statement in a number of respects:

  1. Does "large database server" imply a large volume of data and, if so, how does that impact the design of the stored procedure?

  2. Is the design and optimization of a stored procedure the same as the design and optimization of regular SQL?

  3. Is "design and optimize" redundant? In other words, if you design a stored procedure well, wouldn't it automatically be optimized for performance?

+3  A: 
  1. Typically, yes. When you're working with databases, if you're working with large amounts of data, things that would be fast on small datasets may well be quite slow on large ones.

  2. No, because a stored procedure is written in a normal procedural programming language that happens to include SQL, so there is program logic to design as well as queries.

  3. Not necessarily. You can have two separate designs: one that is easy to code, easy to maintain, etc., but slow for large volumes of data; and another that has a less clean, less simple design, trading added complexity for performance.
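To make point 1 concrete, here is a minimal sketch (using Python's sqlite3 with a hypothetical `orders` table, not any poster's actual schema) of how the same query that is instant on a small table degrades to a full scan on a large one unless an index supports it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 ((i, i % 1000, float(i)) for i in range(10_000)))

# Without an index, the engine must scan every row; cost grows linearly
# with table size, which is exactly what bites on "large database servers".
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall()
print(plan[0][3])   # e.g. 'SCAN orders' (full table scan)

# With an index, the same query becomes a logarithmic-time search.
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall()
print(plan[0][3])   # e.g. 'SEARCH orders USING INDEX idx_orders_customer ...'
```

On a 10,000-row table both plans feel instant; on 800 million rows, only the indexed plan survives.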

Will Hartung
Nicely explained. OP - we do development on databases containing tens of thousands of rows. Our client production databases contain up to 800 *million* rows in key tables. There's a whole different set of needs there, so we spend a lot of time looking at runtime efficiencies.
DaveE
+1  A: 
  1. Large DB servers typically differ from small ones in several ways: more data, more CPUs, more RAM, and bigger and faster disks or a SAN. Some queries run differently in that environment than on small machines. For example, complex joins against large tables might run reasonably fast there, and be prohibitively slow on a smaller machine. There are also caching and memory management approaches that make sense on large machines that aren't nearly as useful on smaller ones.

  2. Not entirely. For example, when you're working on a stored procedure, you are also taking batch boundaries and possible multiple result sets into account. SPs can also have security-related issues that don't exist with dynamic SQL or parameterized queries.

  3. No. Design means to build something that works correctly and meets business requirements. Optimize relates to speed or scalability or both. An SP can be slow, but still do what it's supposed to do. The usual best practice (though one I don't always agree with) is to get it working correctly first, then optimize if it turns out to be necessary.
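The dynamic SQL vs. parameterized query distinction mentioned in point 2 can be sketched as follows (a minimal Python/sqlite3 illustration; the `users` table and the injected string are hypothetical, not from the discussion above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

malicious = "' OR '1'='1"

# Dynamic SQL built by string concatenation: the input is interpreted
# as SQL, so the injected OR clause makes the predicate always true.
rows = conn.execute(
    "SELECT name FROM users WHERE name = '" + malicious + "'").fetchall()
print(len(rows))  # 1 -- every row leaks

# Parameterized query: the driver passes the value as data, not as SQL,
# so the same string matches nothing.
rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)).fetchall()
print(len(rows))  # 0
```

Stored procedures with fixed parameters sidestep this class of problem in a similar way, which is one of the security differences the answer alludes to.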

RickNZ
The big problem with most projects that use "best practice" (point 3) is that they almost always fail to allow time for the optimization steps; then it comes back to bite them when the server starts to choke.
devstuff
@devstuff: I agree, and would add that it's often easier to build things right the first time (at least to a first approximation), and that if you don't, sometimes it can result in the project collapsing under its own weight.
RickNZ