I'm working on a .NET component that gets a set of data from the database, performs some business logic on that set, and then updates single records in the database via a stored procedure named something like spUpdateOrderDetailDiscountedItem.

For small sets of data this isn't a problem, but when a very large set of data required iterating through 368 stored proc calls to update the records in the database, I realized I had a problem. A senior dev looked at my stored proc code and said it looked fine, but now I'd like to explore a better method for sending "batch" data to the database.

What options do I have for updating the database in batch? Is this possible with stored procs? What other options do I have?

I won't have the option of installing a full-fledged ORM, but any advice is appreciated.


Additional Background Info:

Our current data access model was built 5 years ago and all calls to the db currently get executed via modular/static functions with names like ExecQuery and GetDataTable. I'm not certain that I'm required to stay within that model, but I'd have to provide a very good justification for going outside of our current DAL to get to the DB.

Also worth noting, I'm fairly new when it comes to CRUD operations and the database. I much prefer to play/work in the .NET side of code, but the data has to be stored somewhere, right?


Stored Proc contents:

ALTER PROCEDURE [dbo].[spUpdateOrderDetailDiscountedItem] 
    -- Add the parameters for the stored procedure here
    @OrderDetailID decimal = 0,
    @Discount money = 0,
    @ExtPrice money = 0,
    @LineDiscountTypeID int = 0,
    @OrdersID decimal = 0,
    @QuantityDiscounted money = 0,
    @UpdateOrderHeader int = 0,
    @PromoCode varchar(6) = '',
    @TotalDiscount money = 0

AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Update statement for the procedure
    UPDATE OrderDetail
    SET Discount = @Discount,
        ExtPrice = @ExtPrice,
        LineDiscountTypeID = @LineDiscountTypeID,
        LineDiscountPercent = @QuantityDiscounted
    FROM OrderDetail WITH (NOLOCK)
    WHERE OrderDetailID = @OrderDetailID

    IF @UpdateOrderHeader = -1
    BEGIN
        -- This code should run the last time this query is executed, but only then.
        EXEC spUpdateOrdersHeaderForSkuGroupSourceCode @OrdersID, 7, 0, @PromoCode, @TotalDiscount
    END
END
+3  A: 

You can send the full set of data as XML input to the stored procedure. Then you can perform set-based operations to modify the database. Set-based will beat RBAR on performance almost every single time.
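
Something along these lines, as a rough sketch (the proc name, XML shape, and the `discounts`/`connection` variables are all illustrative; inside the proc you would shred the XML with nodes()/value() and run one set-based UPDATE):

using System.Linq;
using System.Data;
using System.Data.SqlClient;
using System.Xml.Linq;

// Build one XML document describing every row to update.
var xml = new XElement("Updates",
    discounts.Select(d => new XElement("Row",
        new XAttribute("OrderDetailID", d.OrderDetailID),
        new XAttribute("Discount", d.Discount))));

using (var cmd = new SqlCommand("dbo.spUpdateOrderDetailDiscountedItemsXml", connection))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@Updates", SqlDbType.Xml).Value = xml.ToString();
    cmd.ExecuteNonQuery(); // one call and one set-based UPDATE, not 368
}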

Raj More
@Raj Thanks for the answer. What's an RBAR?
Ben McCormack
@Ben: Row By Agonizing Row.
Ken Redler
Ha. Yes, perfectly sums up the situation.
Ben McCormack
Blech. WTF? Serialize it to XML, then deserialize it, when going over a connection that doesn't require it and won't benefit? This makes absolutely no sense.
David Lively
+8  A: 

An easy alternative way I've seen in use is to build a single SQL statement consisting of `exec` calls to the sproc, with the parameters inlined in the string. Not sure if this is advised or not, but from the .NET perspective you are only populating one SqlCommand and calling ExecuteNonQuery once...

Note if you choose this then please, please use the StringBuilder! :-)
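
Roughly like this (a sketch only; `discounts` and `connection` are assumed, and since the values are inlined rather than parameterised, they must be escaped/validated by the caller):

using System.Data;
using System.Data.SqlClient;
using System.Text;

var sb = new StringBuilder();
foreach (var d in discounts)
{
    // Values are concatenated into the batch, so sanitize them here.
    sb.AppendFormat(
        "exec spUpdateOrderDetailDiscountedItem @OrderDetailID = {0}, @Discount = {1}; ",
        d.OrderDetailID, d.Discount);
}

using (var cmd = new SqlCommand(sb.ToString(), connection))
{
    cmd.CommandType = CommandType.Text;
    cmd.ExecuteNonQuery(); // one round trip for the whole batch
}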

Update: I much prefer Chris Lively's answer; I didn't know about table-valued parameters until now... unfortunately the OP is using 2005.

Adam
This is probably the least invasive solution. +1
Chris
@Chris I saw it used to batch multiple database commands relatively cleanly from code - you still get the benefits of sprocs, but you need a little extra logic to encode the parameters into the string in a valid manner. Not my favourite route, but understandable.
Adam
Don't think you need to worry about parameterised queries if all you are doing is calling procs, as these have their own query plan anyway?
Noel Kennedy
@Noel sorry, don't follow. The sprocs will be optimised yes, the OP was just curious how to send 368 commands without foreach-ing and refreshing the parameters of the command object.
Adam
Sorry, misread; I thought you were saying that a parameterised query would optimise it.
Noel Kennedy
@Adam This solution ended up working perfectly; it only took about 5 minutes to make the change. I ended up building a `List<string>` used with `String.Join(";", _myList)` to make the call, and it went from taking 5 whole seconds to less than a second. This is exactly the quick fix we needed, so much thanks!
Ben McCormack
+1 Excellent pragmatic solution. But, `StringBuilder`, really? Is the time to create 368 strings really significant compared with 368 stored procedure calls?
MarkJ
It would be substantially more than 368 strings if multiple concatenations occurred in succession - 368 assumes only one giant join :-) Yes, it could be called a micro-optimization, but only if you do it the wrong way the first time; using the StringBuilder takes no longer to code and is more memory-efficient in exactly this scenario - a no-brainer really, even compared to the relative performance gains on high-latency things such as db/network access.
Adam
This is hideous.
David Lively
@David possibly - but it gets the job done and keeps the sproc intact.
Adam
A: 

You could create a batched statement with 368 calls to your proc; then at least you will not have 368 round trips. In pseudo code:

var lotsOfCommands = "exec spUpdateOrderDetailDiscountedItem 1; exec spUpdateOrderDetailDiscountedItem 2; ... exec spUpdateOrderDetailDiscountedItem 368";

var command = new SqlCommand(lotsOfCommands, connection);
command.CommandType = CommandType.Text;

command.ExecuteNonQuery(); // execute the whole batch in one round trip
Noel Kennedy
This doesn't solve anything, and relies on your code to escape values, etc correctly, which is bad. Assuming a persistent database connection, the time it takes to send the command should be minimal. While you're technically combining all updates into one batch by putting them in a single SQL statement, you're not solving the problem of calling a sproc 368 times (which your concatenated SQL still does!). Also, this removes your ability to track errors on a per-update basis.
David Lively
Well, I'm sorry, but do you really think that reducing 368 round trips to one won't increase performance? It is also the least invasive change. I appreciate you will need to escape values. Yes, the whole batch will error rather than one proc execution; depending on the semantics of what's being done, you might want to roll back the whole lot, which is still possible with batching. I've not used SQLCLR, so I can't comment on it, but from what I have read, this scenario could be a reasonable use for it.
Noel Kennedy
+14  A: 

If you are using SQL 2008, then you can use a table-valued parameter to push all of the updates in one sproc call.

Update: Incidentally, we are using this in combination with the MERGE statement. That way SQL Server takes care of figuring out whether we are inserting new records or updating existing ones. This mechanism is used at several major locations in our web app and handles hundreds of changes at a time. During regular load we will see this proc get called around 50 times a second, and it is MUCH faster than any other way we've found... and certainly a LOT cheaper than buying bigger DB servers.
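
The client side looks roughly like this (a sketch; the table type, proc name, and the `discounts`/`connection` variables are illustrative - inside the proc you would MERGE or UPDATE from the @Items parameter):

using System.Data;
using System.Data.SqlClient;

// Assumes a user-defined table type and a proc that accepts it, e.g.:
//   CREATE TYPE dbo.OrderDetailDiscountType AS TABLE
//       (OrderDetailID decimal(18,0), Discount money);
var table = new DataTable();
table.Columns.Add("OrderDetailID", typeof(decimal));
table.Columns.Add("Discount", typeof(decimal));
foreach (var d in discounts)
    table.Rows.Add(d.OrderDetailID, d.Discount);

using (var cmd = new SqlCommand("dbo.spUpdateOrderDetailDiscountedItems", connection))
{
    cmd.CommandType = CommandType.StoredProcedure;
    var p = cmd.Parameters.AddWithValue("@Items", table);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.OrderDetailDiscountType";
    cmd.ExecuteNonQuery(); // one call; the proc handles every row set-based
}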

Chris Lively
Supporting link: http://msdn.microsoft.com/en-us/library/bb675163.aspx
Adam
@Chris I hadn't heard of those before. Thanks for the suggestion. Could you show or link to an example of how you might use that in code both on the client and in the stored proc definition?
Ben McCormack
@Adam you beat me to it! Thanks!
Ben McCormack
I just checked, and unfortunately we're using SQL Server 2005.
Ben McCormack
And the best part is it handles the single record insert flawlessly as well as the multi-record insert. All insert/update/delete procs built for 2008 should consider this method.
HLGEM
A: 

I had issues when trying to do the same thing (via inserts, updates, whatever). While using an OleDbCommand with parameters, it took a bunch of time to constantly re-create the object and parameters each time I called it. So I made a property on my object to hold the command and added the appropriate parameters to it once, up front. Then, when I needed to actually call/execute it, I would loop through each parameter in the object, set it to whatever I needed it to be, then execute it. This created a SIGNIFICANT performance improvement. Pseudo-code of my operation:

protected OleDbCommand oSQLInsert = new OleDbCommand();
// the "?" are place-holders for parameters... can be named parameters, 
// just for visual purposes 
oSQLInsert.CommandText = "insert into MyTable ( fld1, fld2, fld3 ) values ( ?, ?, ? )";
// Now, add the parameters
OleDbParameter NewParm = new OleDbParameter("parmFld1", 0);
oSQLInsert.Parameters.Add( NewParm );

NewParm = new OleDbParameter("parmFld2", "something" );
oSQLInsert.Parameters.Add( NewParm );

NewParm = new OleDbParameter("parmFld3", 0);
oSQLInsert.Parameters.Add( NewParm );

Now the SQL command and the place-holders for the call are all ready to go. Then, when I'm ready to actually call it, I would do something like:

oSQLInsert.Parameters[0].Value = 123;
oSQLInsert.Parameters[1].Value = "New Value";
oSQLInsert.Parameters[2].Value = 3;

Then, just execute it. Across hundreds of repeated calls, the time spent re-creating your commands over and over can kill you...

Good luck.

DRapp
I'd run this through a profiler to make sure that the constructor and parameter collection are really what's eating the time.
David Lively
A: 

Is this a one-time action (like "just import those 368 new customers once") or do you regularly have to do 368 sproc calls?

If it's a one-time action, just go with the 368 calls.
(if the sproc does much more than just updates and is likely to drag down the performance, run it in the evening or at night or whenever no one's working).

IMO, premature optimization of database calls for one-time actions is not worth the time you spend on it.

haarrrgh
A: 

Bulk CSV Import

Build the data output as CSV via a StringBuilder, then do a bulk CSV import:

http://msdn.microsoft.com/en-us/library/ms188365.aspx
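
A rough sketch of the idea (all names are illustrative, and the file path must be readable by the SQL Server service account, not just by your app):

using System.Data.SqlClient;
using System.IO;
using System.Text;

// Write the changes out as CSV.
var sb = new StringBuilder();
foreach (var d in discounts)
    sb.AppendFormat("{0},{1}\r\n", d.OrderDetailID, d.Discount);
File.WriteAllText(@"\\server\share\discounts.csv", sb.ToString());

// Bulk-load the file into a staging table, then apply the staged rows
// with one set-based UPDATE against OrderDetail.
using (var cmd = new SqlCommand(
    @"BULK INSERT dbo.OrderDetailDiscountStaging
      FROM '\\server\share\discounts.csv'
      WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')", connection))
{
    cmd.ExecuteNonQuery();
}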

Darknight
Manually saving your data to CSV and then loading the CSV is ugly. Use `SqlBulkCopy` for this purpose.
Brian
+1  A: 

If you are using a version of SQL Server prior to 2008, you can move your code entirely into the stored procedure itself.

There are good and "bad" things about this.
Good

  • No need to pull the data across a network wire.
  • Faster if your logic is set based
  • Scales up

Bad

  • If you have rules against any logic in the database, this would break your design.
  • If the logic cannot be set based then you might end up with a different set of performance problems
  • If you have outside dependencies, this might increase difficulty.

Without details on exactly what operations you are performing on the data it's hard to give a solid recommendation.

UPDATE
Ben asked what I meant in one of my comments about the CLR and SQL Server. Read Using CLR Integration in SQL Server 2005. The basic idea is that you can write .Net code to do your data manipulation and have that code live inside the SQL server itself. This saves you from having to read all of the data across the network and send updates back that way.

The code is callable by your existing procs and gives you the entire power of .NET, so you don't have to do things like cursors. The SQL stays set based while the .NET code performs operations on individual records.

Incidentally, this is how things like hierarchyid were implemented in SQL 2008.

The only real downside is that some DBA's don't like to introduce developer code like this into the database server. So depending on your environment, this may not be an option. However, if it is, then it is a very powerful way to take care of your problem while leaving the data and processing within your database server.
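
A bare-bones sketch of what such a CLR proc looks like (illustrative only; the assembly gets deployed with CREATE ASSEMBLY and exposed through CREATE PROCEDURE ... EXTERNAL NAME, and the UPDATE here is just a stand-in for your discount logic):

using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public class DiscountLogic
{
    [SqlProcedure]
    public static void ApplyDiscounts()
    {
        // The context connection runs in-process: no data crosses the network.
        using (var conn = new SqlConnection("context connection=true"))
        {
            conn.Open();
            // Per-record .NET logic can live here while the SQL you issue
            // stays set based.
            using (var cmd = new SqlCommand(
                "update OrderDetail set Discount = 0 where Discount is null", conn))
            {
                cmd.ExecuteNonQuery();
            }
        }
    }
}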

Chris Lively
@Chris you bring up a good point about possibly moving the code entirely to the stored proc. The problem is that the "auto-discounting" logic is a little complicated and would almost certainly require using cursors in the stored proc. That seems harder to maintain to me from a code perspective.
Ben McCormack
@Ben McCormack: In that case you still have one more option available under certain versions of SQL server: the CLR. But now we're treading into land that most people stay away from... with good reason. ;)
Chris Lively
@Chris what do you mean when you say that the CLR is an option when working with SQL Server? Obviously, the CLR *is* the engine that runs .NET managed code, but what does that have to do with SQL Server?
Ben McCormack
@Ben McCormack: Updated answer to include detail on integrating .net and sql server.
Chris Lively
@Chris That does sound incredibly powerful. Thanks for the explanation.
Ben McCormack
@Ben Another nice point about SQL CLR assemblies: they get serialized into the database, so you don't have to copy DLLs when moving your db from, say, dev to staging or production.
David Lively
A: 

Table-valued parameters would be best, but since you're on SQL 05, you can use the SqlBulkCopy class to insert batches of records. In my experience, this is very fast.
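
A rough sketch of that pattern (the staging table and the `table`/`connection` variables are illustrative): bulk-load the changes, then apply them with one set-based UPDATE:

using System.Data;
using System.Data.SqlClient;

// `table` is a DataTable shaped like the staging table.
using (var bulk = new SqlBulkCopy(connection))
{
    bulk.DestinationTableName = "dbo.OrderDetailDiscountStaging";
    bulk.WriteToServer(table);
}

using (var cmd = new SqlCommand(
    @"update od
         set od.Discount = s.Discount
        from OrderDetail od
        join dbo.OrderDetailDiscountStaging s
          on s.OrderDetailID = od.OrderDetailID", connection))
{
    cmd.ExecuteNonQuery(); // one statement applies the whole batch
}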

David Lively