I'd like to generate insert-strings for a row in my Oracle database including all its dependent rows in other tables (and their dependent rows).

Example:

CREATE TABLE a (
  a_id number PRIMARY KEY,
  name varchar2(100)
);
CREATE TABLE b (
  b_id number PRIMARY KEY,
  a_id number REFERENCES a(a_id)
);

When I extract the row from a with a_id = 1, the result should be an insert-string for that row and dependent rows:

INSERT INTO a(a_id, name) VALUES (1, 'foo');
INSERT INTO b(b_id, a_id) VALUES (1, 1);
INSERT INTO b(b_id, a_id) VALUES (2, 1);
INSERT INTO b(b_id, a_id) VALUES (3, 1);

I want to do this because I have a large database with many different tables and constraints between them, and I'd like to extract a small subset of the data as test data.

A: 

I think DBUnit can do this.

Jens Schauder
A: 

I just use plain old SQL to do these tasks - use the select statements to generate your inserts:

set pagesize 0
set verify off

  SELECT 'INSERT INTO a(a_id, name) VALUES ('
         || a_id || ', '
         || '''' || name || ''');'
    FROM a
   WHERE a_id = &&1;

  SELECT 'INSERT INTO b(b_id, a_id) VALUES ('
         || b_id || ', '
         || a_id || ');'
    FROM b
   WHERE a_id = &&1;
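One caveat with the generator above: if name contains a single quote, the generated INSERT will not parse. A minimal sketch of one way around that, assuming the same table a from the question, is to double any embedded quotes with REPLACE before wrapping the value:

```sql
-- Sketch only: same generator for table a as above, but embedded single
-- quotes in name are doubled so the generated literal stays valid.
SELECT 'INSERT INTO a(a_id, name) VALUES ('
       || a_id || ', '
       || '''' || REPLACE(name, '''', '''''') || ''');'
  FROM a
 WHERE a_id = &&1;
```

The number column needs no such treatment, since it is emitted unquoted.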
Steve Broberg
This assumes the OP knows all of the dependent tables, which I don't think he does. I think he's looking for something that will walk the dependencies and find all the dependent tables.
Joe
You're right for simple cases. Unfortunately, my usual case is much more complicated than in the example. When more than 10 tables are involved in the dependency chain, the manual creation of these selects is quite tedious and error-prone.
Karl Bartel
You can write SQL to generate the statements above from system tables. See my longer reply.
Steve Broberg
+8  A: 

There may be some tool that does it already, but extracting all dependent rows from an arbitrary starting table is a small development task in itself. I can't write the whole thing for you, but I can get you started - I started to write it, but after about 20 minutes, I realized it was a little more work than I wanted to commit to an unpaid answer.

I can see it being done best by a recursive PL/SQL procedure that would use dbms_output and user_cons_columns & user_constraints to create insert statements for the source table. You can cheat a little by writing all the inserts as if the columns were char values, since Oracle will implicitly convert any char values to the right datatype, assuming your NLS parameters are identical on the source & target system.
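To make the role of the two dictionary views concrete, here is a rough sketch of a query that finds the child tables of a given parent. It assumes the parent is the table a from the question (stored as 'A' in the dictionary) and that parent and child tables live in the current schema; the column aliases are my own.

```sql
-- Sketch: list every foreign key that references a key of table A,
-- pairing each child column with the parent column it points to.
SELECT fk.table_name      AS child_table
     , fk.constraint_name AS fk_name
     , fkc.column_name    AS child_column
     , pkc.column_name    AS parent_column
     , fkc.position
  FROM user_constraints  fk
  JOIN user_constraints  pk
    ON pk.owner = fk.r_owner
   AND pk.constraint_name = fk.r_constraint_name
  JOIN user_cons_columns fkc
    ON fkc.owner = fk.owner
   AND fkc.constraint_name = fk.constraint_name
  JOIN user_cons_columns pkc
    ON pkc.owner = pk.owner
   AND pkc.constraint_name = pk.constraint_name
   AND pkc.position = fkc.position
 WHERE fk.constraint_type = 'R'      -- 'R' marks referential (foreign key) constraints
   AND pk.table_name = 'A'
 ORDER BY fk.table_name, fk.constraint_name, fkc.position;
```

For the example schema this should return one row for table B with child_column and parent_column both A_ID. Running it again with each child_table as the new parent is the recursion the procedure has to perform.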

Note, the package below will have problems if you have circular relationships in your tables; also, on earlier versions of Oracle, you may run out of buffer space with dbms_output. Both problems can be solved by inserting the generated sql into a staging table that has a unique index on the sql, and aborting the recursion if you get a unique key collision. The big time saver below is the MakeParamList function, which converts a cursor that returns a list of columns into either a comma separated list, or a single expression that will display the values of those columns in a quoted, comma separated form when run as the select clause in a query against the table.
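The staging-table idea could look roughly like this. The names extract_stage, extract_stage_seq and stage_statement are made up for illustration, and the unique constraint on a VARCHAR2(4000) column assumes a block size large enough for an index key of that length.

```sql
-- Sketch only: a staging table that rejects duplicate statements,
-- plus a sequence that records the order of generation.
CREATE TABLE extract_stage (
   seq_no    NUMBER         NOT NULL,
   sql_text  VARCHAR2(4000) NOT NULL,
   CONSTRAINT extract_stage_uk UNIQUE (sql_text)
);

CREATE SEQUENCE extract_stage_seq;

-- Returns TRUE if the statement was new, FALSE if it was already staged.
-- The caller would stop recursing on FALSE, which breaks circular chains.
CREATE OR REPLACE FUNCTION stage_statement(p_sql IN VARCHAR2)
   RETURN BOOLEAN
IS
BEGIN
   INSERT INTO extract_stage (seq_no, sql_text)
   VALUES (extract_stage_seq.NEXTVAL, p_sql);

   RETURN TRUE;
EXCEPTION
   WHEN DUP_VAL_ON_INDEX THEN
      RETURN FALSE;
END stage_statement;
/

-- Replay the statements in the order they were generated
-- (use ORDER BY seq_no DESC to reverse them).
SELECT sql_text
  FROM extract_stage
 ORDER BY seq_no;
```

In the package, a call to stage_statement would replace the DBMS_OUTPUT.put_line call, which also sidesteps the output buffer limit.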

Note also that the following package won't really work until you modify it further (one of the reasons I stopped writing it). The initial insert statement is generated on the assumption that the constraint_vals argument identifies a single row; this is almost certainly not the case once you start recursing, since a parent will usually have many child rows. You'll need to move the generation of the first statement (and the subsequent recursive calls) inside a loop to handle the cases where the first EXECUTE IMMEDIATE returns multiple rows instead of a single one. The basics of getting it working are here; you just need to grind out the details and get the outer cursor working.
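As a rough illustration of the multi-row case, the single-row EXECUTE IMMEDIATE ... INTO can be swapped for a BULK COLLECT into a collection that is then looped over. The sketch below is a standalone anonymous block, not part of the package; it hardcodes table b and a_id = 1 from the question, and the type and variable names are my own.

```sql
-- Sketch only: fetch one quoted value list per row, then print one INSERT per row.
-- Run with SET SERVEROUTPUT ON to see the output in SQL*Plus.
DECLARE
   TYPE value_list_tab IS TABLE OF VARCHAR2(4000);

   valuelists   value_list_tab;
   extractsql   VARCHAR2(4000);
BEGIN
   -- The q'[...]' literal avoids doubling every quote in the dynamic SQL.
   extractsql :=
      q'[SELECT '''' || b_id || ''', ''' || a_id || '''' FROM b WHERE a_id = 1]';

   EXECUTE IMMEDIATE extractsql
      BULK COLLECT INTO valuelists;

   -- The loop body is where the recursive call for each row would go.
   FOR i IN 1 .. valuelists.COUNT
   LOOP
      DBMS_OUTPUT.put_line('INSERT INTO b(b_id, a_id) VALUES (' || valuelists(i) || ');');
   END LOOP;
END;
/
```

Unlike the single-row form, BULK COLLECT does not raise NO_DATA_FOUND or TOO_MANY_ROWS; an empty result simply leaves the collection empty and the loop does nothing.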

One final note: It is unlikely that you could run this procedure to generate a set of rows that, when inserted into a target system, would result in a "clean" set of data, since although you would get all dependent data, that data may depend on other tables that you didn't import (e.g., the first child table you encounter may have other foreign keys that point to tables unrelated to your initial table). In that case, you may want to start with the detail tables and work your way up instead of down; doing that, you'd also want to reverse the order of the statements you generated, either using a scripting utility, or by inserting the sql into a staging table as I mention above, with a sequence, then selecting it out with a descending sort.

As for invoking it, you pass the comma separated list of columns to constrain as constraint_cols and the corresponding comma separated list of values as constraint_vals, e.g.:

set serveroutput on
exec Data_extractor.MakeInserts ('MYTABLE', 'COL1, COL2', '99, 105')

Here it is:

CREATE OR REPLACE PACKAGE data_extractor
IS
   TYPE column_info IS RECORD(
      column_name   user_tab_columns.column_name%TYPE
   );

   TYPE column_info_cursor IS REF CURSOR
      RETURN column_info;

   FUNCTION makeparamlist(
      column_info   column_info_cursor
    , get_values    NUMBER
   )
      RETURN VARCHAR2;

   PROCEDURE makeinserts(
      source_table      VARCHAR2
    , constraint_cols   VARCHAR2
    , constraint_vals   VARCHAR2
   );
END data_extractor;
/


CREATE OR REPLACE PACKAGE BODY data_extractor
AS
   FUNCTION makeparamlist(
      column_info   column_info_cursor
    , get_values    NUMBER
   )
      RETURN VARCHAR2
   AS
   BEGIN
      DECLARE
         column_name   user_tab_columns.column_name%TYPE;
         tempsql       VARCHAR2(4000);
         separator     VARCHAR2(20);
      BEGIN
         IF get_values = 1
         THEN
            separator := ''''''''' || ';
         ELSE
            separator := '';
         END IF;

         LOOP
            FETCH column_info
             INTO column_name;

            EXIT WHEN column_info%NOTFOUND;
            tempsql := tempsql || separator || column_name;

            IF get_values = 1
            THEN
               separator := ' || '''''', '''''' || ';
            ELSE
               separator := ', ';
            END IF;
         END LOOP;

         IF get_values = 1
         THEN
            tempsql := tempsql || ' || ''''''''';
         END IF;

         RETURN tempsql;
      END;
   END;

   PROCEDURE makeinserts(
      source_table      VARCHAR2
    , constraint_cols   VARCHAR2
    , constraint_vals   VARCHAR2
   )
   AS
   BEGIN
      DECLARE
         basesql               VARCHAR2(4000);
         extractsql            VARCHAR2(4000);
         tempsql               VARCHAR2(4000);
         valuelist             VARCHAR2(4000);
         childconstraint_vals  VARCHAR2(4000);
      BEGIN
         SELECT makeparamlist(CURSOR(SELECT column_name
                                       FROM user_tab_columns
                                      WHERE table_name = source_table), 0)
           INTO tempsql
           FROM DUAL;

         basesql := 'INSERT INTO ' || source_table || '(' || tempsql || ') VALUES (';

         SELECT makeparamlist(CURSOR(SELECT column_name
                                       FROM user_tab_columns
                                      WHERE table_name = source_table), 1)
           INTO tempsql
           FROM DUAL;

         extractsql := 'SELECT ' || tempsql || ' FROM ' || source_table 
                       || ' WHERE (' || constraint_cols || ') = (SELECT ' 
                       || constraint_vals || ' FROM DUAL)';

         EXECUTE IMMEDIATE extractsql
                      INTO valuelist;

         -- This prints out the insert statement for the root row
         DBMS_OUTPUT.put_line(basesql || valuelist || ');');

         -- Now we construct the constraint_vals parameter for subsequent calls:
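         -- Only the primary key columns are used here; children that reference
         -- a unique key instead would need to be handled per foreign key.
         SELECT makeparamlist(CURSOR(  SELECT ucc.column_name
                                         FROM user_cons_columns ucc
                                            , user_constraints uc
                                        WHERE uc.table_name = source_table
                                          AND uc.constraint_type = 'P'
                                          AND ucc.constraint_name = uc.constraint_name
                                     ORDER BY ucc.position)
                             , 1)
           INTO tempsql
           FROM DUAL;

         -- Same WHERE form as the first query, so multi-column constraints work too
         extractsql := 'SELECT ' || tempsql || ' FROM ' || source_table
                       || ' WHERE (' || constraint_cols || ') = (SELECT '
                       || constraint_vals || ' FROM DUAL)';

         EXECUTE IMMEDIATE extractsql
                      INTO childconstraint_vals;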

-- Now iterate over the dependent tables for this table
-- Cursor on this statement:
--    SELECT uc.table_name child_table, uc.constraint_name fk_name
--      FROM user_constraints uc
--         , user_constraints ucp
--     WHERE ucp.table_name = source_table
--      AND uc.r_constraint_name = ucp.constraint_name;

         --   For each table in that statement, find the foreign key 
         --   columns that correspond to the rows
         --   in the parent table
         --  SELECT column_name
         --    FROM user_cons_columns
         --   WHERE constraint_name = fk_name
         --ORDER BY POSITION;

         -- Pass those columns into makeparamlist above to create 
         -- the constraint_cols argument of the call below:

         -- makeinserts(child_table, childconstraint_cols, childconstraint_vals);
      END;
   END;
END data_extractor;
/
Steve Broberg
Thanks for this excellent answer! I'll need some time to work through this, but I think it contains everything I need to get to my desired solution by myself.
Karl Bartel
+1 for all the effort, if nothing else!
DCookie