views:

221

answers:

8

Hi, I have a question regarding SQL performance. I will illustrate my problem with pseudocode.

I am wondering which of the following will perform better, and by how much, for, say, 10 items on each page load, in .NET. Is it a lot faster? A little faster? A difference not noticeable on SQL Server?

foreach (item in mylist) {
    CallSQLStoredProc(item.id);
}

vs

int[] ids; // array of ids
CallSQLStoredProc(ids)  // stored procedure returns more than one row for each id
+11  A: 

The second option will certainly be faster because it is a single network round trip, as well as a single SP call.
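To make the round-trip point concrete, here is a minimal sketch in Python (as a stand-in for the .NET code in the question), using an in-memory sqlite3 database with a made-up `items` table: all ten ids go to the database in one parameterized `IN` query instead of ten separate calls.

```python
# Sketch: fetch N rows with one parameterized IN query instead of N
# separate queries. sqlite3 and the "items" table are stand-ins for
# the real database and stored procedure.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"item-{i}") for i in range(1, 11)])

ids = [2, 5, 7]

# One round trip: build a placeholder list matching the number of ids.
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT id, name FROM items WHERE id IN ({placeholders})", ids
).fetchall()
print(rows)  # → [(2, 'item-2'), (5, 'item-5'), (7, 'item-7')]
```

With a real server, each eliminated call also eliminates one network round trip, which is usually where the time goes.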

Otávio Décio
A: 

On each page load, or the first time the page is loaded? I would not want to do that for every single postback in an ASP.NET page.

To answer your question more directly: if you're able to get multiple records by sending multiple IDs, do that. It's more efficient and more scalable should you ever need more than 10 items.

Srdjan Pejic
A: 

It all depends on how the proc is coded. If you pass 10 items into the second proc and that proc then uses a cursor to fetch those rows, the first call might be faster.

SQLMenace
+2  A: 

Definitely the second, varying from about 10x faster to a little faster.

If whatever you're doing with the ids can be done in a set operation, you'll get several times the performance of calling the SP individually.

I regularly have procs that look like:

create procedure proc ( @ids varchar(max) ) as
select * from users_tbl u
inner join spiltCSVs(@ids) c
  on c.id = u.id
--so on and so forth

That's a set-based operation, as opposed to the procedural method of using a cursor in the proc, or using a for loop to call the procedure once per id.
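The split-then-join idea can be sketched outside SQL too. Here is a minimal Python analog (the `users` dict and `fetch_users` helper are hypothetical stand-ins for the table and proc): the CSV string becomes a set of ids, and the lookup is a single set operation rather than a loop of single-id calls.

```python
# Stand-in for the users table.
users = {1: "alice", 2: "bob", 5: "carol", 7: "dave"}

def fetch_users(csv_ids: str) -> dict:
    """Split a CSV id string and look all ids up in one pass."""
    ids = {int(s) for s in csv_ids.split(",") if s.strip()}
    return {uid: name for uid, name in users.items() if uid in ids}

print(fetch_users("2,5,7"))  # → {2: 'bob', 5: 'carol', 7: 'dave'}
```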

Tom Ritter
Which db do you use? I am looking for this function "spiltCSVs" in SQL Server. Googling "spiltCSVs" returns me one result :-)
Peter Gfader
It's a utility function everyone writes themselves as needed: http://stackoverflow.com/questions/314824/t-sql-opposite-to-string-concatenation-how-to-split-string-into-multiple-recor
Tom Ritter
+1  A: 

Since this wouldn't fit in a comment for ocdecio's answer...

Just to expand on it... in most systems that I've seen the network traffic is the limiting factor for performance (assuming a reasonably tuned database and front-end code that isn't absolutely horrible). Even if your web server and database server are on the same machine, the interprocess communication can be a limiting factor if you have frequent calls back and forth between the two.

Tom H.
A: 

How much faster the second will be really depends on too many things. The network overhead might be insignificant compared to the size of your result sets.

There is another alternative (which may be faster than either, depending on locking behavior), which is to call all of them asynchronously; your page can then effectively complete when the longest call completes. Obviously, this will require some additional coding.

In this example, there is only one SP overhead. We'll assume the SP returns either a single rowset which the client will split/process or multiple rowsets:

int[] ids; // array of ids
CallSQLStoredProc(ids)  // stored procedure returns more than one row for each id

In this example, the SP call overheads are n times that of the single call, and the calls are serialized:

foreach(item in mylist) {
    CallSQLStoredProc(item.id);
}

In the third alternative:

foreach(item in mylist) {
    StartSQLStoredProc(item.id);
}

// Continue building the page until you reach a point where you absolutely have to have the data

wait();

This still has the n DB call overheads, but the performance improvement depends on the capacity of the SQL Server and the network to parallelize the workload. In addition, you get the benefit of starting the SQL Server working while the page is building.

The single SP solution can still win out, particularly if it can assemble a single result set with a UNION that the SQL Server can parallelize. However, if the result sets have separate schemas or the UNION cannot perform well, a multiple-SP asynchronous solution can beat it (and can also take advantage of the ability to do other work while the page builds).
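The "fire them all, wait at the end" pattern above can be sketched with a thread pool; this is Python rather than .NET, and `call_sql_stored_proc` is a placeholder that sleeps to simulate one blocking DB round trip.

```python
# Sketch: launch all SP calls concurrently, keep building the page,
# then block only at the point where the data is actually needed.
from concurrent.futures import ThreadPoolExecutor
import time

def call_sql_stored_proc(item_id):
    """Placeholder for one blocking DB round trip (~50 ms)."""
    time.sleep(0.05)
    return [(item_id, "row for %d" % item_id)]

ids = [1, 2, 3, 4, 5]
start = time.perf_counter()

with ThreadPoolExecutor(max_workers=len(ids)) as pool:
    futures = [pool.submit(call_sql_stored_proc, i) for i in ids]
    # ... continue building the page here ...
    results = [f.result() for f in futures]  # the wait() point

elapsed = time.perf_counter() - start
# Wall time is close to one call (~0.05 s), not five serialized
# calls (~0.25 s), because the sleeps overlap.
```

The per-call overhead is still paid n times; only the waiting overlaps.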

Cade Roux
If it can be done asynchronously, then it can be written to be parallelized by the DB engine. In that case, the single call will still be faster. The single call will always be faster.
entaroadun
Unfortunately separate batches in SPs are not parallelized by the DB. The single call which has to build 10 rowsets and then return them is not guaranteed to be faster if you can stream 10 different large streams out of the database simultaneously. I will update the answer to demonstrate.
Cade Roux
A: 

Iterating over anything will always add overhead. There aren't many situations where iteration improves performance.

My advice has always been to avoid 2 things in programming:

  1. if-then-else statements
  2. iteration

You will always have situations where you need both, but the less you use them, the more potential your application has to run fast and smooth.

Syntax
A: 

If you want scalability in your application, use caching as much as possible: run any shared query only once and store the result in the cache.

As for your query: provided you aren't using a cursor inside the proc for each ID, the single call should be faster, assuming network latency has a meaningful impact on what you're doing. When in doubt, measure. I've been surprised many times when I actually added timing to my functions to see how long different things took.

In .NET, System.Diagnostics.Stopwatch is your friend :).
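For readers outside .NET, the same measure-before-optimizing advice looks like this in Python, where `time.perf_counter` plays the role of Stopwatch (the summation is just a stand-in workload for the DB call you'd actually time):

```python
# Sketch: wrap the code path you suspect is slow with a
# high-resolution timer before deciding how to optimize it.
import time

start = time.perf_counter()
total = sum(i * i for i in range(100_000))  # stand-in for the DB call
elapsed = time.perf_counter() - start

print(f"took {elapsed * 1000:.2f} ms, result {total}")
```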

Spence