Recently we turned a set of complicated C#-based scheduling logic into a SQL CLR stored procedure (running in SQL Server 2005). We believed our code was a great SQL CLR candidate because:

  • The logic involves tons of data from SQL Server.
  • The logic is complicated and hard to express in T-SQL.
  • There is no threading, synchronization, or access to resources outside of the sandbox.

The results of our stored procedure have been pretty good so far. However, since the output of our logic is in the form of several tables of data, we can't just return a single rowset as the result of the procedure. Instead, our code contains a lot of "INSERT INTO ..." statements inside foreach loops in order to save each record from a C# generic collection into SQL tables, roughly as sketched below. During code review, someone raised a concern about whether this inline SQL INSERT approach within the SQL CLR procedure could cause performance problems, and wondered whether there is a better way to get the data out of our C# generic collections.
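
A stripped-down sketch of the pattern (the table, column, and item type names here are invented for illustration; the real procedure writes to several tables):

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public partial class StoredProcedures
{
    // Illustrative stand-in for one of our result collections.
    private class ScheduleEntry { public int TaskId; public DateTime StartTime; }

    [SqlProcedure]
    public static void RunSchedule()
    {
        List<ScheduleEntry> results = ComputeSchedule();   // the C# scheduling logic

        // Current approach: one INSERT per collection item, executed from inside
        // the CLR procedure over the context connection.
        using (SqlConnection conn = new SqlConnection("context connection=true"))
        {
            conn.Open();
            using (SqlCommand cmd = conn.CreateCommand())
            {
                cmd.CommandText =
                    "INSERT INTO dbo.ScheduleResult (TaskId, StartTime) VALUES (@TaskId, @StartTime)";
                SqlParameter taskId = cmd.Parameters.Add("@TaskId", SqlDbType.Int);
                SqlParameter start = cmd.Parameters.Add("@StartTime", SqlDbType.DateTime);

                foreach (ScheduleEntry entry in results)
                {
                    taskId.Value = entry.TaskId;
                    start.Value = entry.StartTime;
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }

    private static List<ScheduleEntry> ComputeSchedule() { return new List<ScheduleEntry>(); }
}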

So, any suggestion?

A: 

I ran across this while working on an SQLite project a few months back and found it enlightening. I think it might be what you're looking for.

...

Fastest universal way to insert data using standard ADO.NET constructs

Now that the slow stuff is out of the way, let's talk about some hardcore bulk loading. Aside from SqlBulkCopy and specialized constructs involving ISAM or custom bulk insert classes from other providers, there is simply no beating the raw power of ExecuteNonQuery() on a parameterized INSERT statement. I will demonstrate:

internal static void FastInsertMany(DbConnection cnn)
{
    // A single explicit transaction around the whole batch of inserts.
    using (DbTransaction dbTrans = cnn.BeginTransaction())
    {
        using (DbCommand cmd = cnn.CreateCommand())
        {
            // Parameterized INSERT: the statement is prepared once and re-executed.
            cmd.CommandText = "INSERT INTO TestCase(MyValue) VALUES(?)";
            DbParameter Field1 = cmd.CreateParameter();
            cmd.Parameters.Add(Field1);

            for (int n = 0; n < 100000; n++)
            {
                // Only the parameter value changes on each iteration.
                Field1.Value = n + 100000;
                cmd.ExecuteNonQuery();
            }
        }
        dbTrans.Commit();
    }
}
CptSkippy
Answers that are simple references to external sources aren't all that helpful. I'd much rather that you include a summary of the external link in case the link is ever broken. If you include a summary, your answer will still have value.
tvanfosson
Tried the code with my code, and man, it makes quite a significant difference! Thanks! I did change from using DbTransaction to SqlTransaction. BTW, I find it strange that the code runs FASTER inside a transaction; I explicitly tried commenting out the transaction statements, and the code ran slower. I wonder why.
Statements run against a SQL Server database are implicitly run inside a transaction. By declaring the transaction explicitly, you are effectively batching all of the statements together, so SQL Server doesn't initiate a separate transaction for each statement. I would imagine that, in addition to that, SQL Server can further optimize internal processes, like updating indexes, by postponing them until after all statements in the transaction are completed.
CptSkippy
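
To make that concrete inside the SQL CLR procedure itself, here is a hedged sketch of the same loop over the context connection with an explicit SqlTransaction. The table and parameter names are assumptions carried over from the example above; note that with SqlClient the command must be explicitly enlisted in the transaction, otherwise ExecuteNonQuery throws.

// Requires System.Data and System.Data.SqlClient.
internal static void FastInsertManyClr(SqlConnection conn)
{
    // One explicit transaction around all of the INSERTs, so SQL Server does not
    // start and commit an implicit transaction for every single statement.
    using (SqlTransaction tx = conn.BeginTransaction())
    {
        using (SqlCommand cmd = conn.CreateCommand())
        {
            cmd.Transaction = tx;   // SqlClient requires the command to be enlisted explicitly
            cmd.CommandText = "INSERT INTO TestCase (MyValue) VALUES (@MyValue)";
            SqlParameter myValue = cmd.Parameters.Add("@MyValue", SqlDbType.Int);

            for (int n = 0; n < 100000; n++)
            {
                myValue.Value = n + 100000;
                cmd.ExecuteNonQuery();
            }
        }
        tx.Commit();
    }
}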
A: 

You could return a table with two columns (COLLECTION_NAME nvarchar(max), CONTENT xml), filled with as many rows as you have internal collections. CONTENT would be an XML representation of the data in the collection.

Then you can use the XML features of SQL Server 2005/2008 to shred each collection's XML into tables, and perform your INSERT INTO or MERGE statements against the whole table.

That should be faster than individual INSERTs inside your C# code.

Nestor
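
To make the XML suggestion concrete, here is a hedged sketch of one way it could look: the C# side serializes a collection to XML and passes it as a single xml parameter, and the embedded T-SQL shreds it into a set-based INSERT. The table, column, and element names are invented for illustration.

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Text;

// Illustrative item type; stands in for whatever one of the collections holds.
internal class ScheduleEntry
{
    public int TaskId;
    public DateTime StartTime;
}

internal static class XmlInsertExample
{
    internal static void InsertCollectionAsXml(SqlConnection conn, IEnumerable<ScheduleEntry> results)
    {
        // Build <rows><row taskId="1" start="2009-07-01T08:00:00" /> ... </rows>.
        // These values are numeric/date; free-text values would need XML escaping
        // (e.g. build the document with XmlWriter instead of string concatenation).
        StringBuilder xml = new StringBuilder("<rows>");
        foreach (ScheduleEntry entry in results)
        {
            xml.AppendFormat("<row taskId=\"{0}\" start=\"{1:s}\" />", entry.TaskId, entry.StartTime);
        }
        xml.Append("</rows>");

        using (SqlCommand cmd = conn.CreateCommand())
        {
            // Shred the XML server-side and insert the whole collection in one statement.
            cmd.CommandText =
                "INSERT INTO dbo.ScheduleResult (TaskId, StartTime) " +
                "SELECT r.value('@taskId', 'int'), r.value('@start', 'datetime') " +
                "FROM @content.nodes('/rows/row') AS t(r)";
            cmd.Parameters.Add("@content", SqlDbType.Xml).Value = xml.ToString();
            cmd.ExecuteNonQuery();
        }
    }
}

Whether this actually beats a tight parameterized-INSERT loop inside a single transaction would have to be measured; the main appeal is that each collection crosses the CLR/SQL boundary once instead of once per row.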