Hello everyone,

I have two databases, DB1 and DB2, on the same database server. Table1 is in DB1 and Table2 is in DB2. Currently I am using INSERT INTO ... SELECT * to transfer all data from Table2 into Table1 (Table1 is empty; my purpose is to make a copy of the data in Table2). The table structure is a clustered ID column (a GUID) and an XML data column stored as varbinary.

My current issue is that the memory consumption is very high. Are there any good ideas to reduce it? My rough idea is to split the work into a couple of smaller transactions and select/insert a portion of the data in each one.

I am using VSTS 2008 + C# + ADO.NET + SQL Server 2008 Enterprise. Any good solutions or reference samples?

Here is my current code, which throws an out-of-memory exception. I am using the ADO.NET SqlBulkCopy feature.

using System;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;

namespace BulkCopyTable
{
    public class CopyData
    {
        // Connection strings for the source and destination databases.
        readonly string _sourceConnectionString;
        readonly string _destinationConnectionString;

        public CopyData(string sourceConnectionString,
                        string destinationConnectionString)
        {
            _sourceConnectionString = sourceConnectionString;
            _destinationConnectionString = destinationConnectionString;
        }

        public void CopyTable(string table)
        {
            using (SqlConnection source =
                    new SqlConnection(_sourceConnectionString))
            {
                string sql = string.Format("SELECT * FROM [{0}]", table);

                using (SqlCommand command = new SqlCommand(sql, source))
                {
                    source.Open();

                    // Stream rows straight from the reader into the bulk
                    // copy; both are disposed when the using blocks end.
                    using (IDataReader dr = command.ExecuteReader())
                    using (SqlBulkCopy copy =
                            new SqlBulkCopy(_destinationConnectionString))
                    {
                        copy.DestinationTableName = table;
                        copy.WriteToServer(dr);
                    }
                }
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // ConfigurationManager replaces the obsolete
            // ConfigurationSettings class.
            CopyData copier = new CopyData(
                ConfigurationManager.AppSettings["source"],
                ConfigurationManager.AppSettings["destination"]);

            Console.WriteLine("Begin Copy");
            copier.CopyTable(ConfigurationManager.AppSettings["Table"]);
            Console.WriteLine("End Copy");
        }
    }
}

thanks in advance, George

+2  A: 

You could try using the BCP utility.

This can be run in C# using the Process class, if needed.
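For example, here is a minimal sketch (untested) of shelling out to bcp.exe through the Process class; the server name, data file name, and batch size below are placeholder assumptions:

using System;
using System.Diagnostics;

class BcpRunner
{
    static void RunBcp(string arguments)
    {
        ProcessStartInfo psi = new ProcessStartInfo("bcp", arguments);
        psi.UseShellExecute = false;
        psi.RedirectStandardOutput = true;

        using (Process p = Process.Start(psi))
        {
            // Echo bcp's progress output and wait for the copy to finish.
            Console.WriteLine(p.StandardOutput.ReadToEnd());
            p.WaitForExit();
        }
    }

    static void Main()
    {
        // Export the source table in native format (-n) over a trusted
        // connection (-T); "myServer" is a placeholder server name.
        RunBcp("DB2.dbo.Table2 out table2.dat -S myServer -T -n");

        // Import into the destination table, committing every 10,000 rows
        // (-b) so the load is not one huge transaction.
        RunBcp("DB1.dbo.Table1 in table2.dat -S myServer -T -n -b 10000");
    }
}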

skalburgi
Thanks skalburgi. I am currently using SqlBulkCopy, and I think it should be the same as BCP? http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx I am doing the bulk copy in one go to copy all data from the source table to the destination, which causes the out-of-memory error. Any ideas on splitting the process into several sub-processes? I think splitting it up may help reduce the memory consumption.
George2
How about breaking it up into runs where you select a GUID range from the source table? (Assuming you have a random distribution.) For example: Start: 00000000-0000-0000-0000-000000000000, End: AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA
skalburgi
Hi skalburgi, 1. My GUIDs are random; any good ideas on how to split them? :-( 2. I have used BCP to bulk insert from a CSV file into a table; can bulk insert also be used to copy from one table to another?
George2
+2  A: 

Is this a one-time job? If so, you could use DTS or SSIS.
If not, see if you can use SqlBulkCopy and related classes in the framework.

EDIT: I saw your code and can suggest setting the BatchSize property before the call to WriteToServer.
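For example, a minimal sketch of that change against the CopyTable method from the question; the batch size of 10,000 is just an assumed starting point to tune:

using (IDataReader dr = command.ExecuteReader())
using (SqlBulkCopy copy = new SqlBulkCopy(_destinationConnectionString))
{
    copy.DestinationTableName = table;

    // Send rows to the server in chunks instead of one large batch;
    // 10,000 rows per batch is an assumption to adjust for your data.
    copy.BatchSize = 10000;

    // 0 disables the default 30-second timeout for long-running copies.
    copy.BulkCopyTimeout = 0;

    copy.WriteToServer(dr);
}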

shahkalpesh
I have tried to use this class, but it still runs out of memory. Any ideas on how to split the work with SqlBulkCopy?
George2
shahkalpesh
I have posted my code in my original post. Any ideas on how to make it use less memory?
George2
+1  A: 

Would setting up a cursor, stepping through each row in your source table and "insert-into-selecting" for each fetched row, use less memory? The BOL has plenty of examples of stepping through cursors.


Update: Here's an example I copied and modified from the BOL T-SQL reference on FETCH (the comments are from the BOL article; I just changed a few names around):

-- // Declare the variables to store the values returned by FETCH.
DECLARE @id uniqueidentifier, @xml varbinary(max)

DECLARE myCursor CURSOR FOR
SELECT id, xml FROM DB2..Table2

OPEN myCursor

-- // Perform the first fetch and store the values in variables.
-- // Note: The variables are in the same order as the columns
-- // in the SELECT statement.

FETCH NEXT FROM myCursor
INTO @id, @xml

-- // Check @@FETCH_STATUS to see if there are any more rows to fetch.
WHILE @@FETCH_STATUS = 0
    BEGIN

    -- // Copy the fetched row into the destination table.
    INSERT INTO DB1..Table1 (id, xml) VALUES (@id, @xml)

    -- // This is executed as long as the previous fetch succeeds.
    FETCH NEXT FROM myCursor
    INTO @id, @xml

    END

CLOSE myCursor
DEALLOCATE myCursor
Funka
I like your idea of setting up a cursor; any reference samples?
George2
I've updated my answer.
Funka
This will take less memory but will take considerably longer.
HLGEM
+1  A: 

I think you want to set the batch size so that the rows are chunked into manageable pieces.

Bulk copy batch size

Check this Bulk loading data whitepaper for other techniques: Bulk load
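As a rough sketch in terms of the CopyTable code from the question, each batch can also be committed in its own transaction via SqlBulkCopyOptions.UseInternalTransaction; the 50,000-row batch size here is an assumption to tune:

using (IDataReader dr = command.ExecuteReader())
using (SqlBulkCopy copy = new SqlBulkCopy(
        _destinationConnectionString,
        // Commit each batch in its own transaction, so a failure only
        // rolls back the current chunk rather than the whole copy.
        SqlBulkCopyOptions.UseInternalTransaction))
{
    copy.DestinationTableName = table;
    copy.BatchSize = 50000;   // assumed chunk size; tune for your row size
    copy.WriteToServer(dr);
}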

JasonHorner
Thanks JasonHorner. 1. I want to confirm the batch size behavior with you: suppose I have 1M records and set the batch size to 500K; after inserting the first 500K batch, SQL Server will proceed to insert the second 500K batch rather than restart from the beginning, correct? 2. The Bulk load document is great, but it is for SQL Server 2000; are there any related documents for a newer SQL Server version?
George2
Yes, it commits each batch as a separate transaction. Try this: http://blogs.msdn.com/sqlcat/archive/2009/02/12/the-data-loading-performance-guide-now-available-from-msdn.aspx I think the older material is still useful for the most part.
JasonHorner
Thanks! Really cool documents. I will do more testing and give you feedback here. :-)
George2