views:

641

answers:

3

So, I often have to load data into holding tables to run some data validation checks and then return the results. Normally, I create the holding table, then a sqlldr control file and load the data into the table, then I run my queries. Is there any reason I should be using external tables for thing instead? In what way will they make my life easier?

+6  A: 

The big advantage of external tables is that we can query them from inside the database using SQL. So we can just run the validation checks as SELECT statements without the need for a holding table. Similarly if we need to do some manipulation of the loaded data it is almost always easier to do this with SQL rather than SQLLDR commands. We can also manage data loads with DBMS_JOB/DBMS_SCHEDULER routines, which further cuts down the need for shell scripts and cron jobs.

However, if you already have a mature and stable process using SQLLDR then I concede it is unlikely you would realise tremendous benefits from porting to external tables.

There are also some cases - especially if you are loading millions of rows - where the SQLLDR approach may be considerably faster. Howver, the difference will not be as marked with more recent versions of the database. I fully expect that SQLLDR will eventually be deprecated in favour of external tables.

APC
+1, all good points.
DCookie
+1  A: 

If you look at the External Table syntax, it looks suspiciously like SQL*Loader control file syntax :-)

If your external table is going to be repeatedly used in multiple queries it might be faster to load a table (as you're doing now) rather than rescan your external table for each query. As @APC notes, Oracle is making improvements in them, so depending on your DB version YMMV.

DCookie
Suspiciously similar, though not equivalent. :) http://stackoverflow.com/questions/898872/oracle-external-tables-advanced-flat-file-layout
Adam Paynter
Good point. I wonder how much SQL*Loader code was behind at least the initial go at external tables...
DCookie
+1  A: 

I would use external tables for their flexibility.

It's easier to modify the data source on them to be a different file alter table ... location ('my_file.txt1','myfile.txt2')

You can do multitable inserts, merges, run it through a pipelined function etc...

Parallel query is easier ...

It also establishes dependencies better ...

The code is stored in the database so it's automatically backed up ...

David Aldridge