(Database: Oracle 10g R2)

It takes 1 minute to insert 100,000 records into an empty table, but if the table already contains some records (400K), it takes 4 minutes and 12 seconds; CPU wait also jumps up and “Free Buffer Waits” get really high (as seen in dbconsole).

Do you know what’s happening here? Is this because of frequent table extent allocation? The extent size for these tables is 1,048,576 bytes, and I have a feeling the DB is trying to extend the table storage.
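
One way to check whether extent allocation is really to blame is to count the table's extents before and after a load. A sketch against the standard user_extents view (run as the table owner):

-- Count extents and total space for the CUSTOMER segment;
-- compare the numbers before and after the 100K insert.
select segment_name,
       count(*)                 as extents,
       sum(bytes) / 1024 / 1024 as mb
  from user_extents
 where segment_name = 'CUSTOMER'
 group by segment_name;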

I am really confused about this. So any help would be great!


This is the insert statement:

-- Test load: 100,000 single-row inserts, no commits in between.
begin
  for i in 1 .. 100000 loop
    insert into customer
                (id, business_name, address1,
                 address2, city,
                 zip, state, country, fax,
                 phone, email)
         values (customer_seq.nextval,
                 dbms_random.string ('A', 20),              -- business_name: random 20 chars
                 dbms_random.string ('A', 20),              -- address1
                 dbms_random.string ('A', 20),              -- address2
                 dbms_random.string ('A', 20),              -- city
                 trunc (dbms_random.value (10000, 99999)),  -- zip: random 5 digits
                 'CA', 'US', '798-779-7987',
                 '798-779-7987', '[email protected]');
  end loop;
end;


Here is the dstat output (CPU, IO, memory, network) for:

  1. Empty Table inserts: http://pastebin.com/f40f50dbb
  2. Table with 400K records: http://pastebin.com/f48d8ebc7


Output from v$buffer_pool_statistics


ID:                       3
NAME:                     DEFAULT
BLOCK_SIZE:               8192
SET_MSIZE:                4446
CNUM_REPL:                4446
CNUM_WRITE:               0
CNUM_SET:                 4446
BUF_GOT:                  1407656
SUM_WRITE:                1244533
SUM_SCAN:                 0
FREE_BUFFER_WAIT:         93314
WRITE_COMPLETE_WAIT:      832
BUFFER_BUSY_WAIT:         788
FREE_BUFFER_INSPECTED:    2141883
DIRTY_BUFFERS_INSPECTED:  1030570
DB_BLOCK_CHANGE:          44445969
DB_BLOCK_GETS:            44866836
CONSISTENT_GETS:          8195371
PHYSICAL_READS:           930646
PHYSICAL_WRITES:          1244533
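
For reference, these counters can be pulled with a plain query against the view, along the lines of:

-- FREE_BUFFER_WAIT counts waits for DBWR to write out dirty buffers
-- and free up space in the buffer cache.
select name, block_size, free_buffer_wait,
       write_complete_wait, buffer_busy_wait,
       physical_reads, physical_writes
  from v$buffer_pool_statistics;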


UPDATE

I dropped the indexes on this table and performance improved drastically, even when inserting 100K rows into a table with 600K records (it took 47 seconds with no CPU wait; see the dstat output at http://pastebin.com/fbaccb10).

+1  A: 

Sorted inserts always take longer the more entries there are in the table.

Spencer Ruport
A: 

I think it has to do with extending the internal structure of the file, as well as building the indexes for the added information. I believe the database arranges the data in a non-linear fashion that helps speed up data retrieval on selects.

zaczap
+3  A: 

Not sure if this is the same in Oracle, but in SQL Server the first thing I'd check is how many indexes you have on the table. If there are a lot, the DB has to do a lot of work maintaining those indexes as records are inserted, and it's more work to maintain indexes over 500k rows than over 100k.
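
In Oracle the equivalent check is a quick data-dictionary query (table name taken from the question):

-- List the indexes, and their uniqueness, on the CUSTOMER table.
select index_name, uniqueness
  from user_indexes
 where table_name = 'CUSTOMER';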

Kyle West
You're right. I tested this theory by dropping the indexes, and now I don't see CPU wait. See my last update to the description, which has more details.
dt
+1  A: 

Could be building indexes or extending table storage. Without more info it's a bit hard to tell which for sure.

Mitch Wheat
A: 

First of all, I don't know a great deal about databases, but wouldn't it be faster to do a bulk import, such as from a text file, rather than lots of individual insert statements?
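
For what it's worth, a single set-based statement would be one way to test this idea without leaving the database; a sketch reusing the table and sequence from the question:

-- One statement instead of 100,000 single-row inserts: generate the
-- rows with CONNECT BY in an inline view and let the sequence fire
-- once per generated row.
insert into customer
            (id, business_name, address1, address2, city,
             zip, state, country, fax, phone, email)
     select customer_seq.nextval,
            dbms_random.string ('A', 20),
            dbms_random.string ('A', 20),
            dbms_random.string ('A', 20),
            dbms_random.string ('A', 20),
            trunc (dbms_random.value (10000, 99999)),
            'CA', 'US', '798-779-7987', '798-779-7987', '[email protected]'
       from (select level from dual connect by level <= 100000);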

benPearce
A: 

What is the source of your insert rows? Are you reading them from the same database? How often are you committing?

Is Redo and Undo on the same disk as the table and/or indexes?

Why not just make the next extent bigger and run another test, or create a new table with an initial extent large enough to hold 1,000,000 rows and test that? (See the sketch below.)

What kind of disk are you using? Is this a "real" server or just a PC?
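
For the extent experiment, something along these lines would do (the storage sizes are illustrative guesses and the table name is made up; a locally managed AUTOALLOCATE tablespace may round these values):

-- Hypothetical: clone the table structure with a large initial extent
-- so the load never has to allocate space mid-insert.
create table customer_big
  storage (initial 200m next 50m)
  as select * from customer where 1 = 0;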

Angelo Marcotullio
A>Q1: I am doing simple inserts in a loop (see the insert statement, which I have now added), without commits in between. A>Q5: I don't know how to extend the table storage; I think Oracle 10g does this automatically. A>Q6: This is a Linux server.
dt
A: 

The indices are some form of tree, which means the time to insert a record is going to be O(log n), where n is the size of the tree (≈ number of rows for the standard unique index).

The fastest way to insert them is going to be dropping/disabling the index during the insert and recreating it after, as you've already found.
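
In Oracle the disable/rebuild variant looks roughly like this (the index name is hypothetical; note that an unusable UNIQUE index still blocks DML, so this only helps for non-unique indexes):

-- Take the index out of play, load, then rebuild it once at the end.
alter index customer_name_idx unusable;
alter session set skip_unusable_indexes = true;  -- the default in 10g

-- ... run the 100K insert here ...

alter index customer_name_idx rebuild;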

derobert
A: 

Even with indexes, 4 minutes to insert 100,000 records seems like a problem to me.

If this database has I/O problems, you haven't fixed them and they will appear again. I would recommend that you identify the root cause.

If you post the index DDL, I'll time it for a comparison.


I added indexes on id and business_name. Doing 10 iterations in a loop, the average time per 100,000 rows was 25 seconds. This was on my home PC/server, all running on a single disk.
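
For anyone reproducing that run, the two indexes would be created along these lines (the index names are invented):

-- Plain b-tree indexes on the two columns mentioned above.
create index customer_id_idx on customer (id);
create index customer_name_idx on customer (business_name);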

Angelo Marcotullio
Thanks for trying it out. If it were an I/O problem, it should have been equally bad for the first 100K and the last 100K inserts, but it's not, and that's why I was confused. I still haven't figured out the root cause, though; I don't understand why I/O would become so problematic in the latter case.
dt
A: 

You don't say which columns are indexed. If you had indexes on fax, phone, or email, you would have had a LOT of duplicates (i.e. every row). Oracle 'pretends' to have non-unique indexes; in reality every index entry is unique, with the rowid of the actual table row being the deciding factor. The rowid is made up of the file/block/record.

It is possible that, once you hit a certain number of records, the new ones were getting rowids that meant they had to be fitted into the middle of existing indexes, with a lot of index rewriting going on.

If you supply the full table and index creation statements, others will be able to reproduce the experiment, which would allow for more evidence-based responses.
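
A quick way to pull that DDL out of the database (dbms_metadata is available in 10g; set long 10000 in SQL*Plus to see the full CLOB output):

-- Extract the table DDL and the DDL of every index on it.
select dbms_metadata.get_ddl ('TABLE', 'CUSTOMER') from dual;

select dbms_metadata.get_ddl ('INDEX', index_name)
  from user_indexes
 where table_name = 'CUSTOMER';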

Gary
A: 

Another trick to improve performance is to turn on, or increase, the cache on your sequence (customer_seq). This lets Oracle allocate a batch of sequence values in memory instead of hitting the data dictionary for each insert.

Be careful with this one, though: in some situations it will cause your sequence to have gaps between values.
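
For example (the cache size here is only illustrative):

-- Pre-allocate 1000 sequence values in memory per dictionary update;
-- cached values that are never used (e.g. lost on an instance restart)
-- show up as gaps.
alter sequence customer_seq cache 1000;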

More information here: Oracle/PLSQL: Sequences (Autonumber)

Thanks Matt. It's true that caching the sequence improves performance. In fact, in one of the earlier optimizations we got a 30% performance improvement by caching the sequence, which was a real surprise.
dt