Say I write the query:

INSERT INTO DestinationTable
(ColumnA, ColumnB, ColumnC, etc.)
SELECT ColumnA, ColumnB, ColumnC, etc.
FROM SourceTable

And my source table has 22 million rows.

SQL Server fills up my hard drive and errors out.

Why can't SQL Server handle my query?

Should I use a cursor and insert a row at a time?

PS - it is SQL Express 2005, but I could try on the full version.

UPDATE: I also want to mention that my source table only takes up around 1GB of storage when I look at it in Management Studio. And yet my 25GB of free disk space somehow gets filled up? I am also using 2 different databases (Source.mdf -> Destination.mdf); I don't know if this makes any difference.

+8  A: 

Batch update...

INSERT INTO DestinationTable
    (ColumnA, ColumnB, ColumnC, etc.)
SELECT TOP 100000 ColumnA, ColumnB, ColumnC, etc.
FROM SourceTable
WHERE NOT EXISTS (SELECT *
    FROM DestinationTable
    WHERE DestinationTable.KeyCols = SourceTable.KeyCols)

WHILE @@ROWCOUNT <> 0
    INSERT INTO DestinationTable
        (ColumnA, ColumnB, ColumnC, etc.)
    SELECT TOP 100000 ColumnA, ColumnB, ColumnC, etc.
    FROM SourceTable
    WHERE NOT EXISTS (SELECT *
        FROM DestinationTable
        WHERE DestinationTable.KeyCols = SourceTable.KeyCols)

There are variations to deal with checkpointing, log file management, whether you need it in one transaction, etc.

gbn
I think my suggestion of a cursor might be slightly less complicated than this. I will try both and see which performs better.
Jonathan.Peppers
I think a cursor would take a year and a half to insert all your data ;)
womp
@Jonathan.Peppers: a CURSOR still needs resources and locks, and perhaps 22m rows in tempdb, depending on how you declare it
gbn
@gbn, you could eliminate the duplication of the INSERT by replacing the outer INSERT with a _SELECT 1_, because that sets @@ROWCOUNT and gets the loop going the first time
KM
@KM: doh! of course.
gbn
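
Putting gbn's loop and KM's suggestion together, a minimal sketch of one such batched variation might look like the following (assumptions: the CHECKPOINT only helps under the SIMPLE recovery model, KeyCols stands in for the real key columns, and a local variable holds the batch row count so that nothing else resets @@ROWCOUNT):

    -- Prime the loop so the body runs at least once (SQL Server 2005 syntax)
    DECLARE @rows int;
    SET @rows = 1;

    WHILE @rows <> 0
    BEGIN
        -- Copy the next batch of rows that are not yet in the destination
        INSERT INTO DestinationTable
            (ColumnA, ColumnB, ColumnC)
        SELECT TOP 100000 ColumnA, ColumnB, ColumnC
        FROM SourceTable
        WHERE NOT EXISTS (SELECT *
            FROM DestinationTable
            WHERE DestinationTable.KeyCols = SourceTable.KeyCols);

        -- Capture the batch size before any later statement resets @@ROWCOUNT
        SET @rows = @@ROWCOUNT;

        -- Under SIMPLE recovery, a checkpoint lets log space be reused between batches
        CHECKPOINT;
    END
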
+1  A: 

This blog post has info about importing data into SQL Server.

As for the reason your table is filling up, I would look at the schema of the table and make sure the column sizes are as small as they can possibly be.

I would really analyze if all the data is necessary.

David Basarab
The reason I'm moving the table is because I'm making my table structure as small as possible. The data is imported by a 3rd party, or we would not have this issue.
Jonathan.Peppers
+1  A: 

You can bulk copy the data to a CSV file and import it in.

Read up on the BCP utility here.
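
For illustration, a rough sketch of the round trip (the instance name, file path, and pipe delimiter are assumptions, not part of the original answer):

    -- Export from the source database (run bcp from a command prompt):
    --   bcp Source.dbo.SourceTable out C:\temp\SourceTable.txt -c -t"|" -T -S .\SQLEXPRESS

    -- Import into the destination database:
    BULK INSERT Destination.dbo.DestinationTable
    FROM 'C:\temp\SourceTable.txt'
    WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', TABLOCK);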

Raj More
22 million rows?
gbn
I might as well use USPS.
Jonathan.Peppers
Well, I'd use a pipe-delimited .txt instead of a .csv and BULK INSERT or SSIS, but BCP works fine. To Jonathan: I import a 22 million record file into my database using BULK INSERT and it takes 16 minutes.
HLGEM
But you're suggesting exporting to a CSV and then importing back into SQL Server? I'd rather back them up on 1.44 floppies and use the pack and ship promise.
Jonathan.Peppers
A: 

You are inserting data in a way that supports a transaction. There is no way to disable this through the method you're using; however, you could do this outside the scope of a transaction through other methods. Read below:

http://support.microsoft.com/kb/59462

The key approach is this:

Set DBOPTION 'SELECT INTO' to true

http://www.mssqlcity.com/FAQ/Devel/select%5Finto.htm

Nissan Fan
Really? This is a standard INSERT; it cannot be "minimally logged". And since SQL Server 2000 you should use ALTER DATABASE.
gbn
...and the KB is ancient, and even mentions when the option applies
gbn
+2  A: 

You could try setting the database recovery model to "Simple" instead of "Full" (the default). This is done on the Options page of the database properties in Management Studio. That should keep your transaction log size down. After you're done with the insert you can always set the recovery model back to Full.
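
The T-SQL equivalent of that Management Studio option would look roughly like this (the database name is a placeholder):

    -- Keep the transaction log from growing during the load
    ALTER DATABASE Destination SET RECOVERY SIMPLE;

    -- ... run the big INSERT here ...

    -- Switch back once the load has finished
    ALTER DATABASE Destination SET RECOVERY FULL;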

TLiebe
I'll try it and see, I just don't like this solution if it ends up being an automated task.
Jonathan.Peppers
If the DB is in full recovery and log backups are being taken, then a switch to Simple breaks the log chain. This must be very carefully considered in a production environment where point-in-time recovery is required. After switching back to full, a full database backup must be taken to restart the log chain and allow further log backups.
GilaMonster
+1  A: 

I would highly recommend setting the database recovery model to BULK_LOGGED while carrying out such heavy bulk data operations.

By default, a database is set to the SIMPLE or FULL recovery model.

The full recovery model, which fully logs all transactions, is intended for normal use.

The bulk-logged recovery model is intended to be used temporarily during a large bulk operation, assuming that it is among the bulk operations that are affected by the bulk-logged recovery model (for more information, see Operations That Can Be Minimally Logged at msdn.microsoft.com/en-us/library/ms191244.aspx).

The BULK_LOGGED recovery model minimally logs these transactions.

You can do it by using the snippet below:

    --Determine the recovery model currently used for the database

    SELECT name AS [Database Name],
    recovery_model_desc AS [Recovery Model]
    FROM sys.databases 
    WHERE name = '<database_name>';

    --Remember this recovery model so that you can switch back to the same later

    --set the database recovery model to BULK_LOGGED

    ALTER DATABASE <database_name>  SET RECOVERY BULK_LOGGED;

    --Run your heavy data insert tasks
    INSERT INTO DestinationTable
    (ColumnA, ColumnB, ColumnC, etc.)
    SELECT ColumnA, ColumnB, ColumnC, etc.
    FROM SourceTable

    /*Again set the database recovery model to FULL or SIMPLE
    (whichever we got from the first query)*/

    ALTER DATABASE <database_name>  SET RECOVERY FULL;   
    --OR 
    ALTER DATABASE <database_name>  SET RECOVERY SIMPLE;

*Note - Please be patient while the bulk operation is being processed* [:P]

I have done this many times before. Do let me know whether this helped you.

You can refer below MSDN article for details of switching between recovery models - Considerations for Switching from the Full or Bulk-Logged Recovery Model at msdn.microsoft.com/en-us/library/ms190203.aspx

Aamod
I'll try it out; setting it to SIMPLE did not have too much of an effect. It still errored out eventually.
Jonathan.Peppers
Of course... FULL and SIMPLE recovery models are way behind BULK_LOGGED in terms of performance for bulk data operations.
Aamod
BULK_LOGGED was closer but did not quite get there on my system, and even if it did, wouldn't I have to shrink the database/files to get it down to an acceptable size? I think batching the insert, as the top answer suggests, is the way to go.
Jonathan.Peppers
-1. You don't understand bulk-logged. This is not a minimally logged operation.
gbn
Simple recovery also allows minimal logging for certain operations. It's only full where all operations are fully logged.
GilaMonster