Nightly, I need to fill a SQL Server 2005 table from an ODBC source with over 8 million records. Currently I am using an INSERT statement against a linked server, with syntax similar to this:

Insert Into SQLStagingTable Select * from OpenQuery(ODBCSource, 'Select * from SourceTable')

This is really inefficient and takes hours to run. I'm in the middle of coding a solution using SqlBulkCopy, similar to the code found in this question.

The code in that question first populates a DataTable in memory and then passes that DataTable to SqlBulkCopy's WriteToServer method.

What should I do if the populated DataTable uses more memory than is available on the machine it is running on (a server with 16GB of memory, in my case)?

I've thought about using the overloaded OdbcDataAdapter.Fill method, which lets you fill only the records from x to n (where x is the start index and n is the number of records to fill). However, that could turn out to be even slower than what I currently have, since it would mean re-running the select statement against the source a number of times.

What should I do? Just populate the whole thing at once and let the OS manage the memory? Should I populate it in chunks? Is there another solution I haven't thought of?

+4  A: 

The easiest way would be to call ExecuteReader() against your ODBC data source and pass the resulting IDataReader to the WriteToServer(IDataReader) overload.

Most data reader implementations will only keep a very small portion of the total results in memory.
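A minimal sketch of this streaming approach, assuming .NET 2.0's SqlBulkCopy; the connection strings, DSN, and table names ("ODBCSource", "SourceTable", "SQLStagingTable") are placeholders taken from the question, not a working configuration:

```csharp
using System.Data;
using System.Data.Odbc;
using System.Data.SqlClient;

class BulkLoader
{
    static void Main()
    {
        // Placeholder connection strings -- adjust for your environment.
        using (var source = new OdbcConnection("DSN=ODBCSource"))
        using (var dest = new SqlConnection(
            "Data Source=.;Initial Catalog=Staging;Integrated Security=SSPI"))
        {
            source.Open();
            dest.Open();

            using (var cmd = new OdbcCommand("SELECT * FROM SourceTable", source))
            using (IDataReader reader = cmd.ExecuteReader())
            using (var bulk = new SqlBulkCopy(dest))
            {
                bulk.DestinationTableName = "SQLStagingTable";
                bulk.BatchSize = 10000;     // commit in chunks rather than one huge transaction
                bulk.BulkCopyTimeout = 0;   // disable the timeout for a long-running load
                // Rows stream from the reader to the server; the full result
                // set is never materialized in memory.
                bulk.WriteToServer(reader);
            }
        }
    }
}
```

Because WriteToServer pulls rows from the reader one at a time, memory usage stays flat regardless of how many rows the source query returns.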

Sam Saffron
Thanks. I'll give that a shot.
Chad Braun-Duin
+1  A: 

SSIS performs well and is very tweakable. In my experience 8 million rows is not out of its league. One of my larger ETLs pulls in 24 million rows a day and does major conversions and dimensional data warehouse manipulations.

Cade Roux
I really need to take the time to learn SSIS. I hear good things about it, but there definitely is more of a learning curve to it than there was for DTS.
Chad Braun-Duin
Yes, the learning curve is much steeper; however, in a few weeks it can really pay you back.
Cade Roux
A: 

If you have indexes on the destination table, consider disabling them until the records have been inserted, then rebuilding them afterwards.
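On SQL Server 2005 this can be done with ALTER INDEX; the index and table names below are hypothetical. Note that disabling a clustered index makes the table inaccessible, so this pattern applies to nonclustered indexes:

```sql
-- Disable a nonclustered index before the load (names are hypothetical)
ALTER INDEX IX_SQLStagingTable_Col1 ON SQLStagingTable DISABLE;

-- ... perform the bulk insert here ...

-- REBUILD re-enables a disabled index and rebuilds it in one pass
ALTER INDEX ALL ON SQLStagingTable REBUILD;
```

Rebuilding once after the load is generally cheaper than maintaining the index row-by-row during 8 million inserts.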

shahkalpesh
In this case, I'm populating a newly created temporary table, then updating the live table once the temp table has been created, then dropping the temp table. So I don't have any indexes on the temp table.
Chad Braun-Duin