views:

236

answers:

5

Let's say my table structure looks something like this:

CREATE TABLE [dbo].[table1] (
    [id] [int] IDENTITY(1,1) NOT NULL,
    [data] [varchar](255) NOT NULL,
    CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED ([id] ASC)
)

CREATE TABLE [dbo].[table2] (
    [id] [int] IDENTITY(1,1) NOT NULL,
    [table1_id] [int] NOT NULL,
    [data] [varchar](255) NOT NULL,
    CONSTRAINT [PK_table2] PRIMARY KEY CLUSTERED ([id] ASC)
)

The [id] field of the first table corresponds to the [table1_id] field of the second. What I would like to do is insert data into both tables in a single transaction. Now I already know how to do this by doing INSERT-SELECT-INSERT, like this:

BEGIN TRANSACTION;
DECLARE @id [int];
INSERT INTO [table1] ([data]) VALUES ('row 1');
SELECT @id = SCOPE_IDENTITY();
INSERT INTO [table2] ([table1_id], [data]) VALUES (@id, 'more of row 1');
COMMIT TRANSACTION;

That's all good and fine for small cases like that where you're only inserting maybe a handful of rows. But what I need to do is insert a couple hundred thousand rows, or possibly even a million rows, all at once. The data is coming from another table, so if I was only inserting it into a single table, it would be easy, I'd just have to do this:

INSERT INTO [table] ([data])
SELECT [data] FROM [external_table];

But how would I do this and split the data into [table1] and [table2], and still update [table2] with the appropriate [table1_id] as I'm doing it? Is that even possible?

+7  A: 

Try this:

insert into [table] ([data])
output inserted.id, inserted.data into table2
select [data] from [external_table]

UPDATE: Re:

Denis - this seems very close to what I want to do, but perhaps you could fix the following SQL statement for me? Basically the [data] in [table1] and the [data] in [table2] represent two different/distinct columns from [external_table]. The statement you posted above only works when you want the [data] columns to be the same.

INSERT INTO [table1] ([data]) 
OUTPUT [inserted].[id], [external_table].[col2] 
INTO [table2] SELECT [col1] 
FROM [external_table] 

It's impossible to output external columns in an insert statement, so I think you could do something like this

merge into [table1] as t
using [external_table] as s
on 1=0 --modify this predicate as necessary
when not matched then insert (data)
values (s.[col1])
output inserted.id, s.[col2] into [table2]
;
Denis Valeev
Denis, I've only used OUTPUT to write to table variable. Can you use it to insert directly into a live table?
Bill
@Bill you betcha!
Denis Valeev
By the way, there's no need to embrace it in `begin tran... commit tran` statements, as it's plainly going to be run in a single transaction.
Denis Valeev
Denis - this seems very close to what I want to do, but perhaps you could fix the following SQL statement for me? Basically the [data] in [table1] and the [data] in [table2] represent two different/distinct columns from [external_table]. The statement you posted above only works when you want the [data] columns to be the same. `INSERT INTO [table1] ([data]) OUTPUT [inserted].[id], [external_table].[col2] INTO [table2] SELECT [col1] FROM [external_table]`
SoaperGEM
I initially got excited by this but have since run into this error: "The target table of the OUTPUT INTO clause cannot have any enabled check constraints" -- no chance of that happening! Oh well, back to using a staging temp table or table variable to store the output. Nice to know anyhow :)
onedaywhen
@onedaywhen Ahahaha! :))) Yeah, there's a lot of things that need not be in that output table (fk, constraints, triggers), so basically it's useless except for staging tables.
Denis Valeev
Thank you so much Denis. This is what I was looking for, so I accepted your answer. However, just for the sake of completeness, and if you care to... is there any way to do this in SQL Server 2005, where MERGE was not available?
SoaperGEM
@SoaperGEM Only if you created this `col2` column in the `table1`. Then you would be able to output `inserted.col2` into the `table2`.
Denis Valeev
A: 

You could write a stored procedure that iterates over the transaction that you have proposed. The iterator would be the cursor for the table that contains the source data.

mlschechter
Never iterate when there is a set-based solution.
HLGEM
@HLGEM - given my error, which of the proposed solutions would you recommend?
mlschechter
Definitely the output clause to a temp table.
HLGEM
A: 
BEGIN TRANSACTION;

DECLARE @tblMapping table(sourceid int, destid int)

INSERT INTO [table1] ([data]) 
OUTPUT source.id, new.id
Select [data] from [external_table] source;

INSERT INTO [table2] ([table1_id], [data])
Select map.destid, source.[more data] 
from [external_table] source
    inner join @tblMapping map on source.id=map.sourceid;

COMMIT TRANSACTION;
Bill
Given Denis's response to my comment, his solution is much cleaner than mine.
Bill
@Bill You can't use `source.id, new.id` in the `output` clause. You are only allowed to use `inserted.*` in there for `insert`. For `delete`, `update` and `merge` it is possible to include a column from the specified table.
Denis Valeev
A: 

Keep a look out for SQL Server to support the 'INSERT ALL' Statement. Oracle has it already, it looks like this (SQL Cookbook):

insert all
  when loc in ('NEW YORK', 'BOSTON') THEN
   into dept_east(deptno, dname, loc) values(deptno, dname, loc)
  when loc in ('CHICAGO') THEN
   into dept_mid(deptno, dname, loc) values(deptno, dname, loc)
  else
   into dept_west(deptno, dname, loc) values(deptno, dname, loc)
select deptno, dname, loc
  from dept
Brian
A: 

Another option is to run the two inserts separately, leaving the FK column null, then running an update to poulate it correctly.

If there is nothing natural stored within the two tables that match from one record to another (likely) then create a temporary GUID column and populate this in your data and insert to both fields. Then you can update with the proper FK and null out the GUIDs.

E.g.:

CREATE TABLE [dbo].[table1] ( 
    [id] [int] IDENTITY(1,1) NOT NULL, 
    [data] [varchar](255) NOT NULL, 
    CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED ([id] ASC),
    JoinGuid UniqueIdentifier NULL
) 

CREATE TABLE [dbo].[table2] ( 
    [id] [int] IDENTITY(1,1) NOT NULL, 
    [table1_id] [int] NULL, 
    [data] [varchar](255) NOT NULL, 
    CONSTRAINT [PK_table2] PRIMARY KEY CLUSTERED ([id] ASC),
    JoinGuid UniqueIdentifier NULL
) 


INSERT INTO Table1....

INSERT INTO Table2....

UPDATE b
SET table1_id = a.id
FROM Table1 a
JOIN Table2 b on a.JoinGuid = b.JoinGuid
WHERE b.table1_id IS NULL

UPDATE Table1 SET JoinGuid = NULL
UPDATE Table2 SET JoinGuid = NULL
ck