views: 2135
answers: 4

I have to copy a bunch of data from one database table into another. I can't use SELECT ... INTO because one of the columns is an identity column. Also, I have some changes to make to the schema. I was able to use the export data wizard to create an SSIS package, which I then edited in Visual Studio 2005 to make the changes I wanted. It's certainly faster than an INSERT INTO, but it seems silly to me to download the data to a different computer just to upload it back again (assuming I am correct that that's what the SSIS package is doing). Is there an equivalent to BULK INSERT that runs directly on the server, allows keeping identity values, and pulls data from a table? (As far as I can tell, BULK INSERT can only pull data from a file.)

Edit: I do know about IDENTITY_INSERT, but because there is a fair amount of data involved, INSERT INTO ... SELECT is kind of slow. SSIS/BULK INSERT dumps the data into the table without regard to indexes, logging and so on, so it's faster. (Of course, creating the clustered index on the table once it's populated is not fast, but it's still faster than the INSERT INTO ... SELECT that I tried in my first attempt.)

Edit 2: The schema changes include (but are not limited to) the following:
1. Splitting one table into two new tables. In the future each will have its own IDENTITY column, but for the migration I think it will be simplest to use the identity from the original table as the identity for both new tables. Once the migration is over, one of the tables will have a one-to-many relationship to the other.
2. Moving columns from one table to another.
3. Deleting some cross-reference tables that only cross-referenced 1-to-1. Instead, the reference will be a foreign key in one of the two tables.
4. Some new columns will be created with default values.
5. Some tables aren't changing at all, but I have to copy them over due to the "put it all in a new DB" request.

+1  A: 

I think SELECT...INTO should work with an IDENTITY column. You may need to redefine the primary key:

SELECT * INTO NewTable FROM OldTable
GO
ALTER TABLE NewTable ADD PRIMARY KEY(ColumnName)

If that won't work, you can generate a CREATE TABLE script for the old table, change the name to create the new table, and then turn on IDENTITY_INSERT so that an INSERT INTO NewTable ... SELECT ... FROM OldTable can copy the primary key values from the first table. Then you can do your other manipulation on the server in SQL.
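As a rough sketch of the IDENTITY_INSERT approach described above: the table and column names here (dbo.OldTable, dbo.NewTable, Id, Name, CreatedOn) are made up for illustration, and dbo.NewTable is assumed to already exist with Id declared as its IDENTITY column.

```sql
-- Hypothetical tables: dbo.OldTable and dbo.NewTable both have
-- columns (Id, Name, CreatedOn), with Id as the IDENTITY column.

-- Allow explicit values to be written into NewTable's identity column.
SET IDENTITY_INSERT dbo.NewTable ON;

-- While IDENTITY_INSERT is ON, the INSERT must name its columns explicitly,
-- including the identity column.
INSERT INTO dbo.NewTable (Id, Name, CreatedOn)
SELECT Id, Name, CreatedOn
FROM dbo.OldTable;

-- Only one table per session can have IDENTITY_INSERT ON, so turn it off again.
SET IDENTITY_INSERT dbo.NewTable OFF;
```

This keeps the original identity values, but it is still an ordinary INSERT ... SELECT, so it does not address the speed concern raised in the question.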

One nice benefit is that you can test this script locally or on a test server, and can repeat it if need be just by re-running the script.

Are your schema changes too complex to allow changing via script?

Jon Galloway
Can you give an example of an insert that combines SELECT...INTO and IDENTITY_INSERT?
steve_d
+3  A: 

I think you may be interested in Identity Insert

Ryan Ische
Is there a way to use IDENTITY_INSERT with a bulk-insert technique? Normal inserting (INSERT INTO ... SELECT) is too slow for my purposes.
steve_d
+1  A: 

Please try this:

Select * Into NewTable From OldTable Where 1=2

Alter Table NewTable Add id_col int identity(1,1)

insert into NewTable(col1,col2,..... ) /* do not use id_col */ select col1,col2,..... from OldTable

Md Tariq-ul Islam
A: 

Since so many people have looked at this question, I thought I should follow up.

I ended up sticking with the SSIS package. I executed it on the database server itself. It still went through the rigmarole of pulling data from the SQL Server process to the SSIS process, then sending it back. But overall it executed faster than the other options I investigated.

Also, I ran into a bug: when pulling data from a view, the package would just hang. I ended up cutting and pasting the query from the view directly into the "sql query" field of the "source" object in SSIS. This only seemed to happen when the package was running on the same machine as the server. When running from a different machine, I did not run into this error.

If I had to do it all over again, I would probably generate new identity values. I would migrate the old ones to a column in the new table, use those values to associate the other tables' foreign keys, and then I would delete the column once the migration was complete and stable. On the other hand, overall the SSIS package method worked fine, so if you have to do a complex migration (splitting up tables etc.) or need to keep the identity values intact, I would recommend it.
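The "generate new identity values" alternative could look roughly like the following. This is only a sketch under assumptions: the tables (dbo.OldParent, dbo.OldChild, dbo.NewParent, dbo.NewChild) and their columns are hypothetical and stand in for whatever the real schema looks like.

```sql
-- Hypothetical source tables: dbo.OldParent(Id, Name) and
-- dbo.OldChild(Id, ParentId, Detail), where ParentId refers to OldParent.Id.

-- New parent table gets fresh identity values plus a temporary column
-- that remembers the old key.
CREATE TABLE dbo.NewParent (
    Id    INT IDENTITY(1,1) PRIMARY KEY,
    Name  NVARCHAR(100) NOT NULL,
    OldId INT NULL              -- temporary: old identity value, dropped later
);

CREATE TABLE dbo.NewChild (
    Id       INT IDENTITY(1,1) PRIMARY KEY,
    ParentId INT NOT NULL REFERENCES dbo.NewParent (Id),
    Detail   NVARCHAR(100) NULL
);

-- Copy the parents; the old Id goes into OldId and a new Id is generated.
INSERT INTO dbo.NewParent (Name, OldId)
SELECT Name, Id
FROM dbo.OldParent;

-- Copy the children, translating the old foreign key to the new identity
-- by joining on the saved OldId.
INSERT INTO dbo.NewChild (ParentId, Detail)
SELECT np.Id, oc.Detail
FROM dbo.OldChild AS oc
JOIN dbo.NewParent AS np
    ON np.OldId = oc.ParentId;

-- Once the migration is verified and stable, remove the temporary column.
ALTER TABLE dbo.NewParent DROP COLUMN OldId;
```

On a large table, an index on the temporary OldId column would probably help the join; it would have to be dropped before the column itself.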

Thanks to everyone who responded.

steve_d