Suppose I have a custom file format whose contents are analogous to N tables. Let's pick 3. I could transform the file by writing a custom load wrapper that fills 3 database tables.

But suppose that, because of space and resource constraints, I can't store all of this in the tablespace.

Can I use the Oracle preprocessor for external tables to transform the custom file three different ways?

The usage examples I have read use gzip'd text files. But that is a one-to-one file-to-table relationship, with only one transform.

I have a single file with N possible extractions of data.

  • Would I need to define N external tables, each referencing a different program?
  • If I map three tables to the same file, how will this affect performance? (Access is mostly or all reads, few or no writes).

Also, what format does the standard output of my preprocessor have to be? Must it be CSV, or are there ways to configure the external table driver?

A: 

"If I map three tables to the same file, how will this affect performance? (Access is mostly or all reads, few or no writes)."

There should be little or no difference between three sessions accessing the same file through one external table definition and three sessions accessing it through three external table definitions. External tables aren't cached by the database (though the file may be cached by the file system or disk), so from the database's point of view every access is a physical read of the file. Depending on the preprocessor program, there might be some level of serialization there (or you could use the preprocessor program to impose serialization).

Performance-wise, you'd be better off having a single session scan the external file/table and load it into one or more database tables. The other sessions then read the data from those tables, where it is cached in the SGA. You can also index a database table so that you don't have to read all of it.

You may be able to use multi-table inserts to load multiple database tables from a single external table definition in a single pass.
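A rough sketch of such a single-pass load, assuming a hypothetical external table `ext_custom` that exposes a record-type column `rec_type` plus generic character fields `f1` to `f3`. The target tables, their columns, and the record-type codes are all made up for illustration and do not come from the question:

```sql
-- Hypothetical target tables; names, columns and types are assumptions.
CREATE TABLE headers (header_id NUMBER, title VARCHAR2(100));
CREATE TABLE details (header_id NUMBER, line_no NUMBER, amount NUMBER);
CREATE TABLE footers (header_id NUMBER, checksum VARCHAR2(64));

-- One scan of the external table routes each record to its table.
INSERT ALL
  WHEN rec_type = 'H' THEN
    INTO headers (header_id, title)
    VALUES (TO_NUMBER(f1), f2)
  WHEN rec_type = 'D' THEN
    INTO details (header_id, line_no, amount)
    VALUES (TO_NUMBER(f1), TO_NUMBER(f2), TO_NUMBER(f3))
  WHEN rec_type = 'F' THEN
    INTO footers (header_id, checksum)
    VALUES (TO_NUMBER(f1), f2)
SELECT rec_type, f1, f2, f3
FROM   ext_custom;

-- Once loaded, the regular tables can be indexed.
CREATE INDEX details_header_ix ON details (header_id);
```

The external file is read once, and later queries go against the regular tables rather than the file.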

"what format does the standard output of my preprocessor have to be? Must it be CSV, or are there ways to configure the external table driver?"

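The ORACLE_LOADER access driver pretty much follows SQL*Loader conventions, and both are documented in the Oracle Database Utilities manual. The output doesn't have to be CSV: you can use fixed-format (positional) fields or other delimiters.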
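As an illustration, here is a minimal sketch of an external table that reads fixed-width preprocessor output. The directory objects `data_dir` and `exec_dir`, the script `extract_items.sh`, the file `custom.dat`, and the column layout are all assumptions; the script is assumed to write one fixed-width record per line to standard output:

```sql
-- Assumed directory objects: data_dir holds the file, exec_dir holds the script.
-- The querying user needs READ on data_dir (plus WRITE if log and bad files
-- go there, which is the default) and EXECUTE on exec_dir.
CREATE TABLE ext_items (
  item_id   NUMBER,
  item_name VARCHAR2(20)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    PREPROCESSOR exec_dir:'extract_items.sh'
    FIELDS (
      item_id   POSITION(1:5)  CHAR(5),
      item_name POSITION(6:25) CHAR(20)
    )
  )
  LOCATION ('custom.dat')
)
REJECT LIMIT UNLIMITED;
```

If the script emitted pipe-delimited output instead, the positional field list could be replaced with `FIELDS TERMINATED BY '|'` followed by a plain list of field names.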

"Would I need to define N external tables, each referencing a different program?"

It depends on how the data is interleaved. Ignoring preprocessors, you can have different external tables pulling different columns from the same file, or use the LOAD WHEN clause to determine which records each table includes or excludes.
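A sketch of the LOAD WHEN approach, assuming (hypothetically) that every record in `custom.dat` is pipe-delimited and starts with a one-character record type, with `'H'` marking header records. The table, directory, and field names are illustrative only:

```sql
-- One of N external tables over the same file; this one keeps only 'H' records.
CREATE TABLE ext_headers (
  rec_type  VARCHAR2(1),
  header_id NUMBER,
  title     VARCHAR2(100)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    LOAD WHEN (rec_type = 'H')
    FIELDS TERMINATED BY '|'
    MISSING FIELD VALUES ARE NULL
    (
      rec_type  CHAR(1),
      header_id CHAR(10),
      title     CHAR(100)
    )
  )
  LOCATION ('custom.dat')
)
REJECT LIMIT UNLIMITED;
```

A second table, say `ext_details`, would point at the same `LOCATION` with its own field list and `LOAD WHEN (rec_type = 'D')`. No preprocessor is involved in this variant.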

Gary