tags:

views:

371

answers:

7

Hi folks,

i'm trying to migrate some data from two tables in an OLD database, to a NEW database.

The problem is that I wish to generate new Primary Key's in the new database, for the first table that is getting imported. That's simple.

But the 2nd table in the old database has a foreign key dependency on the first table. So when I want to migrate the old data from the second table, the foreign key's don't match any more.

Are there any tricks/best practices involved to help me migrate the data?

Serious Note: i cannot change the current schema of the new tables, which do not have any 'old id' column.

Lets use the following table schema :-

Old Table1              New Table1
ParentId INT PK         ParentId INT PK
Name VARCHAR(50)        Name VARCHAR(50)

Old Table 2             New Table 2
ChildId INT PK          ChildId INT PK
ParentId INT FK         ParentId INT FK     
Foo VARCHAR(50)         Foo VARCHAR(50)

So the table schema's are identical.

Thoughts?

EDIT:

For those that are asking, RDBMS is Sql Server 2008. I didn't specify the software because i was hoping i would get an agnostic answer with some generic T-Sql :P

A: 

Can you change the schema of the old tables? If so, you could put a "new id" column on the old tables, and use that as the reference.

You might have to do a row by row insert on the new table and then retrieve the scope_identity, store it in the old table1. But for table2, you can then join to the old table1 and grab the new_id.

Dane
I was worried that I would have to do this with a cursor. I can't change the schema's. (But I can use temp tables, of course).
Pure.Krome
Yeah, cursor or something like SSIS - can have a couple of data flows. First one selects * from table A, goes through an OLEDB transform (an insert which then selects scope_identity into a new column in the data flow) and then saves the table a record with a new id field into a temp table.Then, second data flow retrieves table b, and does a lookup on old_id to the temp table
Dane
+3  A: 

I think you need to do this in 2 steps.

You need to import the old tables and keep the old ids (and generate new ones). Then once they're in the new database and they have both new and old ids you can use the old Id's to get associate the new ids, then you drop the old ids.

You can do this by importing into temporary (i.e. they will be thrown away) tables, then inserting into the permanent tables, leaving out the old ids.

Or import directy into the new tables (with schema modified to also hold old ids), then drop the old id's when they're no longer necessary.

EDIT:

OK, I'm a bit clearer on what you're looking for thanks to comments here and on other answers. I knocked this up, I think it'll do what you want.

Basically without cursors it steps through the parent table, row by row, and inserts the new partent row, and all the child rows for that parent row, keeping the new id's in sync. I tried it out and it should work, it doesn't need exclusive access to the tables and should be orders of magniture faster than a cursor.

declare @oldId as int
declare @newId as int

select @oldId = Min(ParentId) from OldTable1 

while not @oldId is null
begin
    Insert Into NewTable1 (Name)
    Select Name from OldTable1 where ParentId = @oldId

    Select @newId = SCOPE_IDENTITY()

    Insert Into NewTable2 (ParentId, Foo)
    Select @newId, Foo From OldTable2 Where ParentId = @oldId

    select @oldId = Min(ParentId) from OldTable1 where ParentId > @oldId

end

Hope this helps,

Binary Worrier
Currently, 'New Table 1' already has some data in it. So if I insert the old table data into a temp table (eg #blah, or whatever) or a variable table, i'm not sure how i can then associate that with the rows which I insert into the final, 'New Table 1'. Are you saying my temp table should have 3 fields? Identity PK, (Old)ParentId, (New)ParentId? i'm just not sure how i can find out what the new id's are when i do the insert.?
Pure.Krome
+1  A: 

Well, I guess you'll have to determine other criteria to create a map like oldPK => newPK (for example: Name field is equal?

Then you can determine the new PK that matches the old PK and adjust the ParentID accordingly.

You may also do a little trick: Add a new column to the original Table1 which stores the new PK value for a copied record. Then you can easily copy the values of Table2 pointing them to the value of the new column instead of the old PK.

EDIT: I'm trying to provide some sample code of what I meant by my little trick. I'm not altering the original database structure, but I'm using a temporary table now.

OK, you might try to following:

1) Create temporary table that holds the values of the old table, plus, it gets a new PK:

CREATE TABLE #tempTable1
(
    newPKField INT,
    oldPKField INT,
    Name VARCHAR(50)
)

2) Insert all the values from your old table into the temporary table calculating a new PK, copying the old PK:

INSERT INTO #tempTable1
SELECT
   newPKValueHere AS newPKField,
   ParentID as oldPKField,
   Name
FROM
   Table1

3) Copy the values to the new table

INSERT INTO NewTable1
SELECT
   newPKField as ParentId,
   Name
FROM
   #tempTable1

4) Copy the values from Table2 to NewTable2

INSERT INTO NewTable2
SELECT
   ChildID,
   t.newPKField AS ParentId,
   Foo
FROM 
   Table2
   INNER JOIN #tempTable1 t ON t.ParentId = parentId

This should do. Please note that this is only pseudo T-SQL Code - I have not tested this on a real database! However, it should come close to what you need.

Thorsten Dittmar
I don't understand 'the little trick'. Can u explain with some code?
Pure.Krome
That's cute - changing the schema of the *old* database and doing the grunt work there instead.
lc
@Pure.Krome, I think the "little trick" is to add a "new id" column to the OLD database, then do an insert using this data when you go to copy Table2.
lc
@Pure.Krome: I mean like the following:1) Add new column newPK to original Table1 2) Copy all items in Table1 to new Table1, updating every entry in old Table1 with the new ID in the newPK column3) Copy all items in Table2, referencing the value in the newPK column of Table1
Thorsten Dittmar
By the way: You do not even have to modify the original table. You can create a temporary in memory table with all the fields in Table1 that has a new column. Then select into this temp table. Then copy from temp table to new Table1 updating the newPK field, then copy entries in Table2.
Thorsten Dittmar
Hmm. i still don't get it. Can u show some code please? i can't see how the old table1 'knows' about the new id. If it's in a cursor, then sure .. but no one has said anything about a cursor (yet) :)
Pure.Krome
Updated the above. Please note: You can change the declaration of the temporary table to generate auto-IDs for the new PK values as well! I'm assuming that you're just doing "something" to get the new PK value.
Thorsten Dittmar
A: 

First of all - can you not even have some temporary schema that you can later drop?! That would make life easier. Assuming you can't:

If you're lucky (and if you can guarantee that no other inserts will be happening at the same time) then when you insert the Table1's data into your new table you could perhaps cheat by relying on the sequential order of the inserts.

You could then create a view that joins the 2 tables on a row-count so that you have a way to correlate the keys to each other. That way you'd be one step closer to being able to identify the 'ParentId' for the new Table2.

rohancragg
A: 

I'm not sure from your question what database software you're using, but if temporary tables are an option, create a temporary table containing the original primary key of table1 and the new primary key of table1. Then create another temporary table with a copy of table2, update the copy using the "old key, new key" table you created earlier, then use "insert into select from" (or whatever the appropriate command is for your database) to copy the revised temporary table into its permanent location.

Scott A. Lawrence
I didn't think the DB software was important - i was hoping to get some T-SQL (software independent) answer. That said, how do u insert the new PK into this temp table for each old ID? Are u suggesting a cursor also?
Pure.Krome
I'm not sure a cursor is required for this. A combination of user-defined functions and the UPDATE...FROM construct supported by T-SQL might do the trick. If you don't get an acceptable answer by noon Eastern time, I'll try to mock something up during lunch.
Scott A. Lawrence
Cheers mate :) I'm currently playing with a cursor solution .. cause that's all I can see how this could be done (right now). Much appreciated.
Pure.Krome
I hope the cursor solution works out. I'll probably end up trying to do a non-cursor one anyway, just to see if it can be done.
Scott A. Lawrence
Cheers mate :) lookng forward to it :)
Pure.Krome
A: 

I had the wonderful opportunity to be dug deep in migration scripts last summer. I was using Oracle's PL/SQL for the task. But you did not mention what technology are you using? What are you migrating the data into? SQL Server? Oracle? MySQL?

The approach is to INSERT a row from table1 RETURING the new primary key generated (probably by a SEQUENCE [in Oracle]) and then INSERT the dependent records from table2, changing their foreign key value to the value returned by the first INSERT. Can't help you any better unless you can specify what DBMS are you migrating data into.

Peter Perháč
A: 

The following Pseudo-ish code should work for you

CREATE TABLE newtable1
  ParentId INT PK
  OldId   INT
  Name     VARCHAR(50)

CREATE TABLE newtable2
  ChildId    INT pk
  ParentId   INT FK
  OldParent  INT
  Foo        VARCHAR(50)   

  INSERT INTO newtable1(OldId, Name)
  SELECT  ParentId, Name FROM oldtable1

  INSERT INTO newtable2(OldParent, Foo)
  SELECT ParentId, Foo FROM oldtable2

UPDATE newtable2 SET ParentId = (
  SELECT n.ParentId 
  FROM   newtable1 AS n
  WHERE  n.OldId = newtable2.oldParent
)

ALTER TABLE newtable1 DROP OldId
ALTER TABLE newtable2 DROP OldParent
Steve Weet
Hmm. I can't see how the Update clause would work. You're trying to set the ParentId .. which i don't want to do. I need that to be auto generated ... ???
Pure.Krome
Surely on the second table you want the childId auto-generated and the ParentId to be the primary key of the parent table.
Steve Weet