tags:

views:

7614

answers:

8

What is the fastest way to do Bulk insert to Oracle using .NET? I need to transfer about 160K records using .NET to Oracle. Currently, I'm using insert statement and execute it 160K times.It takes about 25 minutes to complete. The source data is stored in a DataTable, as a result of query from another database (MySQL),

Is there any better way to do this?

EDIT : I'm currently using System.Data.OracleClient, but willing to accept solutions using another provider (ODP.NET, DevArt, etc..)

+3  A: 

You should not commit every insert because committing takes a lot of time.

Which provider do you use to connect your .NET app to the oracle database? Do you use ODP.NET or the Devart provider (aka as corelab provider) or do you use the Microsoft provider for Oracle (System.Data.OracleClient)?

Theo
I've edited the question to answer your comment
Salamander2007
+3  A: 

A really fast way to solve this problem is to make a database link from the Oracle database to the MySQL database. You can create database links to non-Oracle databases. After you have created the database link you can retrieve your data from the MySQL database with a ... create table mydata as select * from ... statement. This is called heterogeneous connectivity. This way you don't have to do anything in your .net application to move the data.

Another way is to use ODP.NET. In ODP.NET you can use the OracleBulkCopy-class.

But I don't think that inserting 160k records in an Oracle table with System.Data.OracleClient should take 25 minutes. I think you commit too many times. And do you bind your values to the insert statement with parameters or do you concatenate your values. Binding is much faster.

Theo
Deleted my Answer because this is a much better solution.
Rob Stevenson-Leggett
I actually has setup the dblink in our dev environemnt, but when it comes to production, it turns out that the oracle and MySQL didn't even resides in the same network, hence my quick and dirty workaround using .net uploader. I really need to look at that OracleBulkCopy. Any pointer?
Salamander2007
+5  A: 

The solution of Rob Stevenson-Legget is slow because he doesn't bind his values but he uses string.Format( ).

When you ask Oracle to execute a sql statement it starts with calculating the has value of this statement. After that it looks in a hash table whether it already knows this statement. If it already knows it statement it can retrieve its execution path from this hash table and execute this statement really fast because Oracle has executed this statement before. This is called the library cache and it doesn't work properly if you don't bind your sql statements.

For example don't do:

int n;

    for (n = 0; n < 100000; n ++)
    {
      mycommand.CommandText = String.Format("INSERT INTO [MyTable] ([MyId]) VALUES({0})", n + 1);
      mycommand.ExecuteNonQuery();
    }

but do:

      OracleParameter myparam = new OracleParameter();
      int n;

      mycommand.CommandText = "INSERT INTO [MyTable] ([MyId]) VALUES(?)";
      mycommand.Parameters.Add(myparam);

      for (n = 0; n < 100000; n ++)
      {
        myparam.Value = n + 1;
        mycommand.ExecuteNonQuery();
      }

Not using parameters can also cause sql injection.

Theo
Deleted my answer because Theo is right with the DB link.
Rob Stevenson-Leggett
A: 

You can use the .Net bulk load API (SQLBulkCopy). This Stackoverflow posting has a code sample of its usage. It's to SQL Server, but uploading to Oracle using SqlBulkCopy shouldn't be radically different.

ConcernedOfTunbridgeWells
You can't use SQLBulkCopy to load into Oracle, only SQL Server. However if you're using ODP.NET, Oracle has it's own API, OracleBulkCopy
Ted Elliott
+7  A: 

I'm loading 50,000 records in 15 or so seconds using Array Binding in ODP.NET

It works by repeatedly invoking a stored procedure you specify (and in which you can do updates/inserts/deletes), but it passes the multiple parameter values from .NET to the database in bulk.

Instead of specifying a single value for each parameter to the stored procedure you specify an array of values for each parameter.

Oracle passes the parameter arrays from .NET to the database in one go, and then repeatedly invokes the stored procedure you specify using the parameter values you specified.

http://www.oracle.com/technology/sample_code/tech/windows/odpnet/howto/arraybind/index.html

/Damian

Damian Mehers
+1 This is the right answer having a .NET solution.
Christian13467
That works great with arrays but he is writing a datatable back to the database. How does one use array bindings to a datatable in ODP.Net?
mcauthorn
For each row in the data table the individual column values need to be added to the array of values (one value per row) for that column, and then the single insert using array binding must be called.
Damian Mehers
+2  A: 

Oracle says (http://www.oracle.com/technology/products/database/utilities/htdocs/sql_loader_overview.html)

SQL*Loader is the primary method for quickly populating Oracle tables with data from external files

My experience is that their loader loads their tables faster than anything else.

S.Lott
Not sure that it has a .NET API though - I think the original question was re .NET. Also it wasn't sufficient for my purposes because it just does inserts - I wanted to UPDATE/INSERT depending on whether records existed already. Might be sufficient for the questioner though.
Damian Mehers
SQL*Loader is an oracle data loading tool. It can handle large amount of data stored in csv, txt or other files. Export your data from mysql into csv format and write a loader script to pump the data into oracle.
Christian13467
A: 

To follow up on Theo's suggestion with my findings (apologies - I don't currently have enough reputation to post this as a comment)

First, this is how to use several named parameters:

String commandString = "INSERT INTO Users (Name, Desk, UpdateTime) VALUES (:Name, :Desk, :UpdateTime)";
using (OracleCommand command = new OracleCommand(commandString, _connection, _transaction))
{
    command.Parameters.Add("Name", OracleType.VarChar, 50).Value = strategy;
    command.Parameters.Add("Desk", OracleType.VarChar, 50).Value = deskName ?? OracleString.Null;
    command.Parameters.Add("UpdateTime", OracleType.DateTime).Value = updated;
    command.ExecuteNonQuery();
}

However, I saw no variation in speed between:

  • constructing a new commandString for each row (String.Format)
  • constructing a now parameterized commandString for each row
  • using a single commandString and changing the parameters

I'm using System.Data.OracleClient, deleting and inserting 2500 rows inside a transaction

Neil
+1  A: 

I recently discovered a specialized class that's awesome for a bulk insert (ODP.NET). Oracle.DataAccess.Client.OracleBulkCopy! It takes a datatable as a parameter, then you call WriteTOServer method...it is very fast and effective, good luck!!

Tarik