views:

708

answers:

11

Edit: Let me completely rephrase this, because I'm not sure there's an XML way like I was originally describing.

Yet another edit: This needs to be a repeatable process, and it has to be able to be set up in a way that it can be called in C# code.

In database A, I have a set of tables, related by PKs and FKs. A parent table, with child and grandchild tables, let's say.

I want to copy a set of rows from database A to database B, which has identically named tables and fields. For each table, I want to insert into the same table in database B. But I can't be constrained to use the same primary keys. The copy routine must create new PKs for each row in database B, and must propagate those to the child rows. I'm keeping the same relations between the data, in other words, but not the same exact PKs and FKs.

How would you solve this? I'm open to suggestions. SSIS isn't completely ruled out, but it doesn't look to me like it'll do this exact thing. I'm also open to a solution in LINQ, or using typed DataSets, or using some XML thing, or just about anything that'll work in SQL Server 2005 and/or C# (.NET 3.5). The best solution wouldn't require SSIS, and wouldn't require writing a lot of code. But I'll concede that this "best" solution may not exist.

(I didn't make this task up myself, nor the constraints; this is how it was given to me.)

A: 

Dump the XML approach and use the import wizard / SSIS.

jms
We'd prefer to stay away from DTS/SSIS if possible. We're looking for something that (appropriately packaged in code) is easy for an end user to run by picking a set of items to import from a list.
Kyralessa
A: 

First, let me say that SSIS is your best bet. But, to answer the question you asked...

I don't believe you will be able to get away with creating new id's all around, although you could but you would need to take the original IDs to use for lookups.

The best you can get is one insert statement for table. Here is an example of the code to do SELECTs to get you the data from your XML Sample:

declare @xml xml 
set @xml='<People Key="1" FirstName="Bob" LastName="Smith">
  <PeopleAddresses PeopleKey="1" AddressesKey="1">
    <Addresses Key="1" Street="123 Main" City="St Louis" State="MO" ZIP="12345" />
  </PeopleAddresses>
</People>
<People Key="2" FirstName="Harry" LastName="Jones">
  <PeopleAddresses PeopleKey="2" AddressesKey="2">
    <Addresses Key="2" Street="555 E 5th St" City="Chicago" State="IL" ZIP="23456" />
  </PeopleAddresses>
</People>
<People Key="3" FirstName="Sally" LastName="Smith">
  <PeopleAddresses PeopleKey="3" AddressesKey="1">
    <Addresses Key="1" Street="123 Main" City="St Louis" State="MO" ZIP="12345" />
  </PeopleAddresses>
</People>
<People Key="4" FirstName="Sara" LastName="Jones">
  <PeopleAddresses PeopleKey="4" AddressesKey="2">
    <Addresses Key="2" Street="555 E 5th St" City="Chicago" State="IL" ZIP="23456" />
  </PeopleAddresses>
</People>
'

select t.b.value('./@Key', 'int') PeopleKey,
    t.b.value('./@FirstName', 'nvarchar(50)') FirstName,
    t.b.value('./@LastName', 'nvarchar(50)') LastName
from @xml.nodes('//People') t(b)

select t.b.value('../../@Key', 'int') PeopleKey,
    t.b.value('./@Street', 'nvarchar(50)') Street,
    t.b.value('./@City', 'nvarchar(50)') City,
    t.b.value('./@State', 'char(2)') [State],
    t.b.value('./@Zip', 'char(5)') Zip
from 
@xml.nodes('//Addresses') t(b)

What this does is take Nodes from the XML and parse out the data. To get the relational id from people we use ../../ to go up the chain.

Josef
A: 

By far the easiest way is Red Gate's SQL Data Compare. You can set it up to do just what you described in a minute or two.

Jonathan Allen
A: 

I love Red Gate's SQL Compare and Data Compare too but it won't meet his requirements for the changing primary keys as far as I can tell.

If cross database queries/linked servers are an option you could do this with a stored procedure that copies the records from parent/child in DB A into temporary tables on DB B and then add a column for the new primary key in the temp child table that you would update after inserting the headers.

My question is if the records don't have the same primary key how do you tell if it's a new record? Is there some other candidate key? If these are new tables why can't they have the same primary key?

Mike L
A: 

I have created the same thing with a set of stored procedures.

Database B will have its own primary keys, but store Database A's primary keys, for debuging purposes. It means I can have more than one Database A!

Data is copied via a linked server. Not too fast; SSIS is faster. But SSIS is not for beginners, and it is not easy to code something that works with changing source tables.

And it is easy to call a stored procedure from C#.

A: 

I'd script it in a Stored Procedure, using Inserts to do the hard work. Your code will take the PKs from Table A (presumably via @@Scope_Identity) - I assume that the PK for Table A is an Identity field?

You could use temporary tables, cursors or you might prefer to use the CLR - it might lend itself to this kind of operation.

I'd be surprised to find a tool that could do this off the shelf with either a) pre-determined keys, or b) identity fields (clearly Tables B & C don't have them).

CJM
A: 

Are you clearing the destination tables each time and then starting again? That will make a big difference to the solution you need to implement. If you are doing a complete re-import each time then you could do something like the following:

Create a temporary table or table variable to record the old and new primary keys for the parent table.

Insert the parent table data into the destination and use the OUTPUT clause to capture the new ID's and insert them with the old IDs into the temp table. NOTE: Using the output clause is efficient and allows you to do the insert in bulk without cycling through each record to be inserted.

Insert the child table data. Join to the temp table to retrieve the new foreign key required.

The above process could be done using T-SQL Script, C# code or SSIS. My preference would be for SSIS.

Dr8k
Nope, I'm adding to existing data in the tables.
Kyralessa
A: 

If you are adding each time then you may need to keep a permanent table to track the relationship between source database primary keys and destination database primary keys (at least for the parent table). If you needed to keep this kind of data out of the destination database, you could get SSIS to store/retrieve it from some kind of logging database or even a flat file.

You could probably avoid the above scenario if there is a combination of fields in the parent table that can be used to uniquely identify that record and therefore "find" the primary key for that record in the destination database.

Dr8k
A: 

I think most likely what I'm going to use is typed datasets. It won't be a generalized solution; we'll have to regenerate them if any of the tables change. But based on what I've been told, that's not a problem; the tables aren't expected to change much.

Datasets will make it reasonably easy to loop through the data hierarchically and refresh PKs from the database after insert.

Kyralessa
A: 

When dealing with similar tasks I simply created a set of stored procedures to do the job.

As the task that you specified is pretty custom, you are not likely to find "ready to use" solution.

Just to give you some hints:

  • If the databases are on different servers use linked servers so you can access both source and destination tables simply through TSQL

In the stored procedure:

  • Identify the parent items that need to be copied - you said that the primary keys are different so you need to use unique constraints instead (you should be able to define them if the tables are normalised)
  • Identify the child items that need to be copied based on the identified parents, to check if some of them are already in the destination db use the unique constraints approach again
  • Identify the grandchild items (same logic as with parent-child)
  • Copy data over starting with the lowest level (grandchildren, children, parents)

There is no need for cursors etc, simply store the immediate results in the temporary table (or table variable if working within one stored procedure)

That approach worked for me pretty well.

You can of course add parameter to the main stored procedure so you can either copy all new records or only ones that you specify.

Let me know if that is of any help.

kristof
+2  A: 

I think the SQL Server utility tablediff.exe might be what you are looking for.

See also this thread.

Sklivvz