I have a data archiving process in Java that moves data between DB2 and Sybase. FYI - this is not done through any import/export utility because there are several conditions on each table that are only known at run-time, which is why the process is written in Java.

Right now I have a single DatabaseReader and a single DatabaseWriter defined for each source and destination combination, so that data is moved in multiple threads. I want to expand this further so that I can have multiple DatabaseReaders and multiple DatabaseWriters defined for each source and destination combination.

So, for example, if the source table has about 100 rows and I define 10 readers and 10 writers, each reader will read 10 rows and hand them to a writer. I hope this will give me the best possible performance, depending on the resources available on the server (CPU, memory, etc.).
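
Roughly, the design I have in mind looks like this (just a sketch, not my real code; readPartition and writeBatch stand in for the logic inside my DatabaseReader/DatabaseWriter classes):

import java.util.*;
import java.util.concurrent.*;

public class ParallelArchiver {

    static final int PARTITIONS = 10;

    public static void main(String[] args) {
        final BlockingQueue<List<Object[]>> queue =
                new ArrayBlockingQueue<List<Object[]>>(PARTITIONS);
        ExecutorService pool = Executors.newFixedThreadPool(PARTITIONS * 2);

        for (int p = 0; p < PARTITIONS; p++) {
            final int partition = p;
            // Reader: fetch one slice of the source table and queue it.
            pool.submit(new Runnable() {
                public void run() {
                    queue.add(readPartition(partition));
                }
            });
            // Writer: take a queued slice and insert it into the target.
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        writeBatch(queue.take());
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
        }
        pool.shutdown();
    }

    // How to define each slice when the table has no primary key
    // is exactly the problem described below.
    static List<Object[]> readPartition(int partition) {
        return new ArrayList<Object[]>();
    }

    static void writeBatch(List<Object[]> rows) {
        // insert rows into the Sybase target here
    }
}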

But I guess the problem is these source tables do not have primary keys and it is extremely difficult to grab rows in multiple sets.

Oracle provides the ROWNUM concept, and I guess life is much simpler there... but what about DB2? How can I achieve this behavior with DB2? Is there a way to say fetch the first 10 records, then fetch the next 10 records, and so on?

Any suggestions/ideas?

DB2 version: DB2 v8.1.0.144, Fix Pack 16, Linux

A: 

DB2 does indeed support a ROWNUM-like concept. Not knowing your database schema, here is an example:

SELECT *
FROM (
  SELECT rownumber() OVER (order by column1, column2) AS number, column1, column2
  FROM mytable
) temp
WHERE number BETWEEN 21 AND 30

The subquery here selects every row in your table, and the rownumber() function assigns a number to each row based on the order specified. The "outer" query can then use that row number to select the rows you actually want for each batch.
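
Driven from JDBC, each reader could then be handed its own range along these lines (a sketch only; connection handling is omitted, and the table and column names are taken from the example above):

import java.sql.*;
import java.util.*;

public class RangeReader {
    // Fetch rows `from` through `to` (inclusive) of mytable using the
    // windowed query above.
    public static List<Object[]> readRange(Connection con, int from, int to)
            throws SQLException {
        String sql =
            "SELECT column1, column2 FROM (" +
            "  SELECT rownumber() OVER (ORDER BY column1, column2) AS number," +
            "         column1, column2 FROM mytable" +
            ") temp WHERE number BETWEEN ? AND ?";
        PreparedStatement ps = con.prepareStatement(sql);
        ps.setInt(1, from);
        ps.setInt(2, to);
        ResultSet rs = ps.executeQuery();
        List<Object[]> rows = new ArrayList<Object[]>();
        while (rs.next()) {
            rows.add(new Object[] { rs.getObject(1), rs.getObject(2) });
        }
        rs.close();
        ps.close();
        return rows;
    }
}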

Scott Jones
Bad idea! The rownumber function calculates a row number based on the currently selected rows -- it is calculated during the select processing, so it is not repeatable and involves extra processing.
James Anderson
A: 

If I understand correctly, you are just trying to write to the target table in batches of 10 rows. There is no need to batch the reads into sets of 10 rows.

Just fire off the select statement, then keep reading until you run out of data. On every 10th row you can invoke the writer. However, most JDBC drivers will perform this sort of optimisation under the covers, so this will probably be no better than inserting rows one at a time.
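
In code, that loop might look something like this (just a sketch; the query is a placeholder, and DatabaseWriter here is simply an interface standing in for your writer class):

import java.sql.*;
import java.util.*;

public class BatchingCopy {

    // Stand-in for the asker's writer class.
    public interface DatabaseWriter {
        void write(List<Object[]> rows) throws SQLException;
    }

    public static void copy(Connection src, DatabaseWriter writer)
            throws SQLException {
        Statement sel = src.createStatement();
        ResultSet rs = sel.executeQuery("SELECT column1, column2 FROM mytable");
        List<Object[]> batch = new ArrayList<Object[]>();
        while (rs.next()) {
            batch.add(new Object[] { rs.getObject(1), rs.getObject(2) });
            if (batch.size() == 10) {        // every 10th row, hand off a batch
                writer.write(batch);
                batch = new ArrayList<Object[]>();
            }
        }
        if (!batch.isEmpty()) {
            writer.write(batch);             // flush the remainder
        }
        rs.close();
        sel.close();
    }
}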

I have my doubts that a naive multithreaded implementation will speed things up, as the target database engine is doing most of the work anyway, and it will be constrained by log writes and locking, which multithreading does not help (or actively makes worse!).

As far as I can see, a simple single-threaded "select"/"insert" process will easily outperform your multithreaded program. There are other ways to optimise this:

  1. Tune your database clients so that network operations are batched up.
  2. Examine your "COMMIT" processing so that you commit after every 100 or so inserts (both points are sketched below).
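
In JDBC terms, the two points together might look something like this (a sketch only; the query, table, and column names are placeholders):

import java.sql.*;

public class TunedCopy {
    public static void copy(Connection src, Connection dst) throws SQLException {
        dst.setAutoCommit(false);
        Statement sel = src.createStatement();
        ResultSet rs = sel.executeQuery("SELECT column1, column2 FROM mytable");
        PreparedStatement ins = dst.prepareStatement(
                "INSERT INTO target_table (column1, column2) VALUES (?, ?)");
        int n = 0;
        while (rs.next()) {
            ins.setObject(1, rs.getObject(1));
            ins.setObject(2, rs.getObject(2));
            ins.addBatch();                 // point 1: batch network operations
            if (++n % 100 == 0) {
                ins.executeBatch();
                dst.commit();               // point 2: commit every 100 inserts
            }
        }
        ins.executeBatch();                 // flush and commit the remainder
        dst.commit();
        rs.close();
        sel.close();
        ins.close();
    }
}
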
James Anderson