views: 1054

answers: 5

I need to insert many small rows rapidly into Oracle. (5 fields).

With MySQL, I break the inserts into groups of 100, then use one multi-row INSERT statement for each group of 100 rows.

But with Oracle, user feedback is that the bulk inserts (anywhere from 1,000 to 30,000 rows) are too slow.

Is there a similar trick I can use to speed up the programmatic inserts from Java into Oracle?

A: 

Nowadays MySQL is owned by Oracle, so perhaps a simpler solution might be to stay on MySQL...

If not, then make sure the transaction is started before you begin each group of inserts; once the group is finished, commit the transaction and start a new one for the next group.

Also check for unnecessary index definitions that may be slowing the insert time.
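A rough sketch of that commit-per-group pattern in plain JDBC (the table name `t23` and its column names are placeholders for illustration):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class GroupedInserts {
    static final int GROUP_SIZE = 100;

    // Insert rows one statement at a time, but commit once per group
    // of GROUP_SIZE rather than once per row.
    static void insertGrouped(Connection conn, List<Object[]> rows) throws SQLException {
        conn.setAutoCommit(false); // take manual control of transaction boundaries
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO t23 (c1, c2, c3, c4, c5) VALUES (?, ?, ?, ?, ?)")) {
            int inGroup = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]);
                }
                ps.executeUpdate();
                if (++inGroup == GROUP_SIZE) {
                    conn.commit(); // one commit per group
                    inGroup = 0;
                }
            }
            conn.commit(); // commit the trailing partial group (if any)
        }
    }

    // Pure helper: number of commits issued for a given row count.
    static int commitCount(int rows, int groupSize) {
        return rows / groupSize + 1; // the trailing commit always runs
    }
}
```

The `commitCount` helper only makes the commit arithmetic explicit; the real point is turning off auto-commit and committing once per group instead of once per row.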

Updated...
Bulk inserting is the last step of ETL (Extract, Transform, Load), so have you considered using a Java-based ETL tool such as Pentaho Kettle or Talend Open Studio?

Pentaho describes its Oracle bulk-loading facilities here.

A quick Google search also turns up some preliminary evidence that Talend has some support for Oracle bulk loading.

crowne
MySQL is obviously not Oracle. The company is, but the software isn't. Besides, it's the customer's choice. I prefer MySQL, but we need to support both.
Will Glass
Just to confirm, it's already all within a single transaction.
Will Glass
The comment about staying on MySQL because it is Oracle was meant to be taken as a joke ... sort-of ... there I go again ...
crowne
Thanks. Probably useful for some, but I'm uploading dynamically calculated data generated in response to a web request. It needs to be done from Java.
Will Glass
A: 

Multithread it with multiple database connections?

Pablojim
Then you'd have multiple transactions. Sounds icky.
Will Glass
You didn't say it had to be transactional!
Pablojim
+3  A: 

Oracle JDBC batching is described here.
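For reference, a minimal sketch of standard JDBC statement batching (the table name `t23`, its columns, and the batch size are assumptions; with the Oracle driver, each `executeBatch()` call becomes a single round trip to the database):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchInsert {
    static final int BATCH_SIZE = 100;

    // Queue rows with addBatch() and send them to the server in one
    // round trip per executeBatch() call.
    static void insertBatched(Connection conn, List<Object[]> rows) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO t23 (c1, c2, c3, c4, c5) VALUES (?, ?, ?, ?, ?)")) {
            int queued = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]);
                }
                ps.addBatch();
                if (++queued % BATCH_SIZE == 0) {
                    ps.executeBatch(); // one network round trip per batch
                }
            }
            ps.executeBatch(); // flush any remaining rows
            conn.commit();
        }
    }

    // Pure helper: how many executeBatch() round trips a row count costs.
    static int roundTrips(int rows, int batchSize) {
        return rows / batchSize + 1; // the final flush always runs
    }
}
```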

Gary
+1  A: 

You don't say how you are passing these records to the database. The best way is to use an array, as this allows the use of Oracle's nifty FORALL bulk operations.

This example package has two procedures: one which populates a collection of T23 records (T23 being a table with five numeric columns) and one which bulk inserts the records into that table using an array.

SQL> create or replace package p23 as
  2      type t23_nt is table of t23%rowtype;
  3      function pop_array ( p_no in number )
  4          return t23_nt;
  5      procedure ins_table ( p_array in t23_nt );
  6  end p23;
  7  /

Package created.

SQL> create or replace package body p23 as
  2
  3      function pop_array ( p_no in number )
  4          return t23_nt
  5      is
  6          return_value t23_nt;
  7      begin
  8          select level,level,level,level,level
  9          bulk collect into return_value
 10          from dual
 11          connect by level <= p_no;
 12          return return_value;
 13      end pop_array;
 14
 15      procedure ins_table
 16              ( p_array in t23_nt )
 17      is
 18          s_time pls_integer;
 19      begin
 20
 21          s_time := dbms_utility.get_time;
 22
 23          forall r in p_array.first()..p_array.last()
 24              insert into t23
 25              values p_array(r);
 26
 27          dbms_output.put_line('loaded '
 28                  ||to_char(p_array.count())||' recs in '
 29                  ||to_char(dbms_utility.get_time - s_time)
 30                  ||' csecs');
 31      end ins_table;
 32  end p23;
 33  /

Package body created.

SQL>

Here is the output from some sample runs:

SQL> declare
  2      l_array p23.t23_nt;
  3  begin
  4      l_array := p23.pop_array(500);
  5      p23.ins_table(l_array);
  6      l_array := p23.pop_array(1000);
  7      p23.ins_table(l_array);
  8      l_array := p23.pop_array(2500);
  9      p23.ins_table(l_array);
 10      l_array := p23.pop_array(5000);
 11      p23.ins_table(l_array);
 12      l_array := p23.pop_array(10000);
 13      p23.ins_table(l_array);
 14      l_array := p23.pop_array(100000);
 15      p23.ins_table(l_array);
 16  end;
 17  /
loaded 500 recs in 0 csecs
loaded 1000 recs in 0 csecs
loaded 2500 recs in 0 csecs
loaded 5000 recs in 1 csecs
loaded 10000 recs in 1 csecs
loaded 100000 recs in 15 csecs

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*) from t23
  2  /

  COUNT(*)
----------
    119000

SQL>

I think inserting 100,000 records in 0.15 seconds should please all but the most demanding of users. So, the question is, how do you approach your inserts?

APC
Thanks! JDBC from Java is my only constraint.
Will Glass
+5  A: 

You can use Spring's DAO module to batch insert many rows.

An example that inserts a collection of Order objects into the database in one update:

public class OrderRepositoryImpl extends SimpleJdbcDaoSupport implements
        OrderRepository {

    private final String saveSql = "INSERT INTO orders(userid, username, coffee, coffeename, amount) "
            + "VALUES(?, ?, ?, ?, ?)";

    public void saveOrders(final Collection<Order> orders) {
        List<Object[]> ordersArgumentList = new ArrayList<Object[]>(orders
                .size());

        Object[] orderArguments;
        for (Order order : orders) {
            orderArguments = new Object[] { order.getUserId(),
                    order.getUserName(), order.getCoffee(),
                    order.getCoffeeName(), order.getAmount() };

            ordersArgumentList.add(orderArguments);
        }

        getSimpleJdbcTemplate().batchUpdate(saveSql, ordersArgumentList);
    }
}
Espen
good to know. I'm not using Spring here, but will bear this in mind.
Will Glass
An advantage of this solution is that it sends all the data at the end as one big package. If you execute an update for each row, you pay the network round trip for every row too. JPA also provides batch facilities similar to this Spring solution.
Espen