views:

34

answers:

2

One portion of my site requires a bulk insert; it takes around 40 minutes for InnoDB to load that file into the database. I have been digging around the web and found a few things.

  1. innodb_autoinc_lock_mode=2 (it won't generate consecutive keys)
  2. UNIQUE_CHECKS=0 (disable unique key checks)
  3. FOREIGN_KEY_CHECKS=0 (disable foreign key checks)
  4. --log_bin=OFF (turn off the binary log used for replication)
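
Options 2 and 3 are set with the SET statement, which is session-scoped by default. A sketch of what the session setup might look like (the table and file names here are hypothetical):

```sql
-- affects this session only; other connections keep their checks
SET UNIQUE_CHECKS = 0;
SET FOREIGN_KEY_CHECKS = 0;

LOAD DATA INFILE '/path/to/file.csv' INTO TABLE my_table;  -- hypothetical names

-- re-enable before doing anything else on this connection
SET UNIQUE_CHECKS = 1;
SET FOREIGN_KEY_CHECKS = 1;
```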

Problem

I want to set the first 3 options for just one session, i.e. only during the bulk insert. The first option does not work; MySQL says unknown system variable 'innodb_autoinc_lock_mode'. I am using MySQL 5.0.4.

The last option I would like to turn off, but I am wondering: if I need replication later, will it just start working again when I turn it back on?

Suggestions

Any other suggestions on how to improve bulk inserts/updates for the InnoDB engine? Or please comment on my findings.

Thanks

A: 

To answer the last part of your question: no, it won't just start working again; if the inserts are not replicated but subsequent updates are, the result will not be a pretty sight. Disabling foreign and unique key checks should be OK, provided you re-enable them afterwards and deal with any constraint violations.

How often do you have to do this? Can you load smaller datasets more frequently?

Brian Hooper
I have a large data set and it runs every second hour. What if I use mysql_query('SET UNIQUE_CHECKS=0') in PHP? Does it disable unique checks for this session only, or globally? Mostly I have 6000+ records every second hour; around 45% are update records, the rest are inserts.
jason
Permanently, I would think, as it's a MySQL command. You could reinstate it at the end of the update with a UNIQUE_CHECKS=1. It's rather mysterious why it takes so long to insert/update 6000 rows, though.
Brian Hooper
+1  A: 

Assuming you are loading the data in a single or few transactions, most of the time is likely to be spent building indexes (depending on the schema of the table).

Do not do a large number of small inserts with autocommit enabled; that will destroy performance, with a sync to disc for each commit.

If your table is bigger than (or nearly as big as) the InnoDB buffer pool, you are in trouble; a table which can't fit in RAM along with its indexes cannot be inserted into efficiently, as it will have to do READS in order to update existing index blocks.

Remember that disc writes are OK (they are mostly sequential, and you have a battery-backed RAID controller, right?), but reads are slow and need to be avoided.
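
If the table and its indexes don't fit, growing the buffer pool is the usual fix. A sketch of the relevant my.cnf settings; the values below are purely illustrative, not recommendations for any particular machine:

```ini
[mysqld]
# illustrative values only -- size these to your RAM and data set
innodb_buffer_pool_size = 2G
innodb_log_file_size    = 256M
```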

In summary

  • Do the inserts in a small number of big-ish transactions, say 10k-100k rows each. Don't make the transactions too big or you'll exhaust the logs.
  • Get enough RAM that your table fits in memory, and set the InnoDB buffer pool size appropriately (you are running x86_64, right?).
  • Don't worry about the operation taking a long time: thanks to MVCC, your app will be able to operate on the previous versions of the rows, assuming it's only reading.
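
The batching pattern above can be sketched in a few lines. This uses Python's stdlib sqlite3 as a stand-in for a MySQL connection (the table name, column names, and batch size are all hypothetical; the point is one commit per batch, not one per row):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [(i, "data-%d" % i) for i in range(6000)]
BATCH = 2000  # big-ish batches; small enough not to exhaust the logs

conn.isolation_level = None  # manage transactions explicitly
cur = conn.cursor()
for start in range(0, len(rows), BATCH):
    cur.execute("BEGIN")  # one sync per batch...
    cur.executemany("INSERT INTO records VALUES (?, ?)",
                    rows[start:start + BATCH])
    cur.execute("COMMIT")  # ...instead of one per row

count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
print(count)  # 6000
```

With a real MySQL driver the structure is the same: disable autocommit, insert a batch, commit, repeat.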

Don't make any of the optimisations listed in the question; they're probably a waste of time (don't take my word for it: benchmark the operation on a test system in your lab with and without them).

Turning unique checks off is actively dangerous, as you'll end up with broken data.

MarkR
+1 for good advice and the warning about unique checks. Disabling unique checks can actually boost performance a lot, particularly when the data starts to no longer fit in memory (since it allows you to use the insert buffer with UNIQUE indexes), but the parent post is right to give a warning: InnoDB will do nothing to reconcile duplicates when you turn the checks back on.
Morgan Tocker