views:

58

answers:

2

Currently I am using Ruby 1.9.1 and the 'ruby-mysql' gem, which unlike the 'mysql' gem is written in ruby only. This is pretty slow actually, as it seems to insert at a rate of almost 1 per second (SLOOOOOWWWWWW). And I have a lot of inserts to make too, its pretty much what this script does ultamitely. I am using just 1 connection (since I am using just one thread). I am hoping to speed things up by creating a fiber that will

  1. create a new DB connection
  2. insert 1-3 records
  3. close the DB connection

I would imagine launching 20-50 of these would greatly increase DB throughput. Am I correct to go along this route? I feel that this is the best option, as opposed to refactoring all of my DB code since the syntax is a quite a bit different for the default mysql driver :(

A: 

Why are you closing a connection after a few inserts? Your script should be:

  1. Open a connection
  2. Write however many records you need
  3. Close connection

Transactions can complicate this.

If you are doing non-transactional inserts, absolutely use one connection and write each record one at a time.

If you are using transactions and writing a truly huge data set (ie millions of rows) then your database may need to be tuned so it has sufficient temporary storage to deal with that.

Basically if a single insert is taking a second it suggests to me one or more of:

  1. You foreign keys aren't indexed so the referential integrity checks on the insert are woefully inefficient; or
  2. Excessive connection opening and closing.

Now all that being said, having multiple "workers" doing inserts can lead to useful increases in overall throughput but at 1 second per insert, that isn't your problem. You need to find what your problem is.

With truly huge batch inserts it may also be worth disabling check constraints, foreign keys and indices, doing all the inserts and then re-enabling them.

cletus
I never said I was closing the connection after each insert... (which I'm not btw). FKs are indexed, this was default from MySQL Workbench. There isn't excessive connection opening/closing. It is only done once. The bottleneck is clearly with the ruby driver since it isn't the default C wrapper, but a pure ruby driver.
Zombies
Oh, and there was a transaction in it, I took it out and only noticed very minor improvements. I suppose now that I think about it I am mostly wondering if Ruby's fibers would effecient or just as slow. And I don't think ruby has any support for thread pooling either, making this kind of difficult.
Zombies
A: 

Since you are using ruby1.9.1, just use Threads. They are no longer green. Fibers are not going to help as they are too light weight for multiple, concurrent DB connections.

Zombies