So, I have this Java-based data transformation / masking tool which I wanted to test out on Oracle 10g. The good part with Oracle 10g is that you get a load of sample schemas, with half a million records in some. The schemas are SH, OE, HR, IX and so on. So, I installed 10g and found out that the installation scripts are under ORACLE_HOME/demo/scripts.

I customized these scripts a bit to run in batch mode. That solves one half of my requirement - creating source data for testing my data transformation software. The second half of the requirement is to create the same schemas under different names (TR_HR, TR_OE and so on...) without any data. These schemas would represent my target schemas. So, in short, my software would pick up data from a table in one schema and load it into the same table in a different schema.

Now, I have two issues in creating my target schemas and emptying them.

  • I would like to do this in a batch job. But in the Oracle scripts you get, the sample schema names are not configurable. So, I tried creating a script, replacing OE with TR_OE, HR with TR_HR and so on. However, this approach is rather irritating because the sample schemas are quite complicated in the way they are created; Oracle creates synonyms, views, materialized views, data types and a lot of other weird stuff.
  • I would like the target schemas (TR_HR, TR_OE, ...) to be empty. But some of the schemas have circular references, which would not let me delete the data. The only workaround seems to be disabling certain foreign keys, deleting the data and then re-enabling the constraints, as in the example below.
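
Something along these lines, for instance (the table and constraint names here are made up for illustration; the real names vary per schema):

alter table TR_HR.DEPARTMENTS disable constraint DEPT_MGR_FK;
delete from TR_HR.EMPLOYEES;
delete from TR_HR.DEPARTMENTS;
alter table TR_HR.DEPARTMENTS enable constraint DEPT_MGR_FK;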

Is there any easy way to do all this, without all the fuss? I would need a complicated data set for my testing (complicated as in tables with triggers, multiple hierarchies... for instance, a child table that has children up to 5 levels deep, a parent table that refers to an IOT table, an IOT table that refers to a non-IOT table, etc.). The sample schemas are just about perfect from a data set perspective. The only challenge I see is in automating this whole process of loading up the source schemas, and then creating the target schemas and emptying them. I'd appreciate your help and suggestions.


UPDATE

The main script that you have to run for manually installing the Oracle sample schemas is mkplug.sql. Here is the line that loads the schemas from a dmp file:

host imp "'sys/&&password_sys AS SYSDBA'" transport_tablespace=y file=&imp_file log=&imp_logfile datafiles='&datafile' tablespaces=EXAMPLE tts_owners=hr,oe,pm,ix,sh

Well, I tried modifying this line (after patching up path-related issues in mkplug.sql and all the other SQL files) to this:

host imp "'sys/&&password_sys AS SYSDBA'" rows=n transport_tablespace=y file=&imp_file log=&imp_logfile datafiles='&datafile' tablespaces=EXAMPLE tts_owners=hr,oe,pm,ix,sh

And... it did NOT help. The schema got created with row data, despite the rows=n parameter :(

+1  A: 

Here is an anonymous block which - for a given schema - disables triggers and foreign keys, truncates all the tables and then re-enables the triggers and foreign keys. It uses truncate for speed, but obviously this means no rollback: so be careful which schema name you supply! It's easy enough to convert that call into a delete from statement if you prefer.

The script is a fine example of cut'n'paste programming, and would no doubt benefit from some refactoring to remove the repetition.

begin
    << dis_triggers >>
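    -- disable every trigger defined on the schema's tables so that nothing
    -- fires while the data is being removed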
    for trgs in ( select owner, trigger_name 
                  from all_triggers 
                  where table_owner = '&&schema_name' )
    loop
        execute immediate 'alter trigger '||trgs.owner||'.'||trgs.trigger_name
                                ||' disable';
    end loop dis_triggers;

    << dis_fkeys >>
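    -- disable the schema's foreign key constraints so that parent tables
    -- can be truncated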
    for fkeys in ( select owner, table_name, constraint_name
                  from all_constraints 
                  where owner = '&&schema_name' 
                  and constraint_type = 'R')
    loop
        execute immediate 'alter table '||fkeys.owner||'.'||fkeys.table_name
                                ||' disable constraint '||fkeys.constraint_name;
    end loop dis_fkeys;

    << zap_tables >>
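    -- truncate every table in the schema (fast, but cannot be rolled back)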
    for tabs in ( select owner, table_name
                  from all_tables 
                  where owner = '&&schema_name' )
    loop
        execute immediate 'truncate table '||tabs.owner||'.'||tabs.table_name
                                ||' reuse storage';
    end loop zap_tables;

    << en_fkeys >>
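    -- put the foreign key constraints back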
    for fkeys in ( select owner, table_name, constraint_name
                  from all_constraints 
                  where owner = '&&schema_name' 
                  and constraint_type = 'R')
    loop
        execute immediate 'alter table '||fkeys.owner||'.'||fkeys.table_name
                                ||' enable constraint '||fkeys.constraint_name;
    end loop en_fkeys;

    << en_triggers >>
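    -- put the triggers back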
    for trgs in ( select owner, trigger_name 
                  from all_triggers 
                  where table_owner = '&&schema_name' )
    loop
        execute immediate 'alter trigger '||trgs.owner||'.'||trgs.trigger_name
                                ||' enable';
    end loop en_triggers;

end;
/
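
To run it, you could save the block as a script (the file name below is just an example) and call it from SQL*Plus with the schema name defined up front - otherwise the &&schema_name substitution variable will simply prompt you for it:

define schema_name = TR_OE
@zap_schema.sql
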
APC
@APC I think I can give this a shot. Do you know of any free-to-use public data sets that have a complicated table structure (things like IOT tables, non-IOT tables, depth in a parent's child tables, IOTs referencing non-IOTs and vice versa, etc.)?
Jay
+5  A: 

Since you're already familiar with exp/imp (or expdp/impdp) from the Oracle scripts that use the .dmp file, why not just:

  • Create the empty TR_xxx schemas
  • Populate the TR_xxx schema from the xxx .dmp file with the FROMUSER/TOUSER options and ROWS=N (similar options exist for expdp/impdp)

[Edit after reading your comment about the transportable tablespaces]

I didn't know that the Oracle scripts were using transportable tablespaces and that multiple schemas were being imported from a single file. This is probably the most straightforward way to create your new empty TR schemas:

  • Start with the standard, populated database built with the Oracle scripts
  • Create no-data export files on a schema-by-schema basis (OE shown) by:

    exp sys/&&password_sys AS SYSDBA file=oe_nodata.dmp log=oe_nodata_exp.log owner=OE rows=N grants=N

    (You should only have to do this once and this dmp file can be reused)

Now, your script should:

  • Drop any TR_ users with the CASCADE option
  • Re-create the TR_ users
  • Populate the schema objects (OE shown) by:

    host imp "'sys/&&password_sys AS SYSDBA'" file=oe_nodata.dmp log=tr_oe_imp.log fromuser=OE touser=TR_OE
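
Putting the three steps together for one schema might look something like this - just a sketch, run from SQL*Plus as a privileged user; the password variable, tablespace and grants are placeholders, and you would repeat or parameterize it for HR, PM, IX and SH:

drop user TR_OE cascade;
create user TR_OE identified by &&tr_password
    default tablespace EXAMPLE quota unlimited on EXAMPLE;
-- grants are illustrative only; the sample schemas may need more privileges
grant connect, resource to TR_OE;
host imp "'sys/&&password_sys AS SYSDBA'" file=oe_nodata.dmp log=tr_oe_imp.log fromuser=OE touser=TR_OE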

dpbradley
the issue is with creating the TR_XXX schemas. Like you said, I would have to load up my TR_xxx schemas from the xxx.dmp files and then delete the data in them. Deleting the data is the hardest part - some schemas have circular references, which would require me to disable foreign key constraints and re-enable them once the data is deleted. In short, this is a very painful approach.
Jay
The ROWS=N option will create only the Oracle objects (tables, indexes, views, etc) but not insert any data. By "Create the empty schema", I meant just create the Oracle user/schema with no objects - i.e. "CREATE USER TR_OE IDENTIFIED BY mypassword"
dpbradley
ROWS=N sounds cool, I will give this a shot and post an update
Jay
ROWS=N doesn't work :( - has this got something to do with the transportable tablespaces?
Jay