views: 108
answers: 4

Hi,

I have a MyISAM table with 2 fields, f1 and f2, both unsigned integers that cannot be NULL. The table purposely has no primary key, but it has an index on f2. The table currently has 320 million rows.

I would like to insert new rows (about 4000, once every week) into this table at a decent speed. Currently, however, inserting about 4000 rows takes 2 minutes. (I am doing this with a text file and the "source" command; the text file contains just INSERT statements into this table.) Is there a way to speed up the INSERT statements? Also, while the INSERT statements are running, will any SELECT/JOIN statements against the same table be affected or slowed down?

Thanks in advance, Tim

+2  A: 

You can bulk up the insert statements from

INSERT INTO table (f1, f2) VALUES ($f1, $f2);
INSERT INTO table (f1, f2) VALUES ($other, $other);
etc...

into

INSERT INTO table (f1, f2) VALUES ($f1, $f2), ($other, $other), etc...

which will reduce parsing overhead somewhat. This may speed things up a little. However, don't go overboard grouping the inserts, as the query is subject to the max_allowed_packet setting.

4000 rows in 2 minutes is still about 33 rows per second. That's not too shabby, especially on a huge table where an index has to be updated. You could disable keys on the table prior to doing the insert and then rebuild the index afterwards with a REPAIR TABLE, but that might take longer, especially with 320 million rows to scan. You'd have to do some benchmarking to see if it's worthwhile.
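
A rough sketch of that disable/rebuild approach, using the ALTER TABLE ... DISABLE/ENABLE KEYS pair that TMM mentions further down (the table name is just a placeholder; on MyISAM this only suspends non-unique indexes, which covers the index on f2):

ALTER TABLE mytable DISABLE KEYS;
-- run the INSERT statements (or the bulked-up version above)
ALTER TABLE mytable ENABLE KEYS;   -- rebuilds the suspended index in one pass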

As for SELECTs/JOINs, since you're on MyISAM tables, there's no way to hide the new rows in a transaction until they're all done. Each row will be visible to other sessions as soon as it's inserted, unless you lock the table so you get exclusive access to it for the inserts. But then you've locked everyone else out while the insert is running.
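
If you do take the locking route, it would look roughly like this (again, mytable is a placeholder):

LOCK TABLES mytable WRITE;
-- run the inserts here; other sessions' SELECTs wait until the lock is released
UNLOCK TABLES;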

Marc B
Two other approaches: load your data into a fresh temp table and insert it into the main table with INSERT INTO table SELECT * FROM temp, or bulk load it with `LOAD DATA INFILE`.
nos
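
A sketch of the temp-table variant nos describes, with placeholder names (staging is assumed to hold only the new rows):

CREATE TABLE staging LIKE mytable;
-- load the ~4000 new rows into staging (source, LOAD DATA INFILE, etc.)
INSERT INTO mytable (f1, f2) SELECT f1, f2 FROM staging;
DROP TABLE staging;
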
A: 

As far as I know, the source command is the fastest way of doing this. Since the table is MyISAM, the whole table is locked during writes. So yes, all SELECT statements are queued up until all inserts/updates/deletes have finished.

elusive
A: 

If the data to load can be accessed by the database, you could use the LOAD DATA INFILE command. As described in the manual:

The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed.
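
For this table it might look something like the following (the file path and field terminator are assumptions; the file would hold one f1,f2 pair per line):

LOAD DATA INFILE '/path/to/new_rows.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ','
(f1, f2);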

Hope that helps.

rjk
Thanks, I have used LOAD DATA INFILE. I will need to insert up to 200,000 rows, and this still takes a long time with indexes enabled. So I have tried ALTER TABLE ... DISABLE KEYS, then LOAD DATA INFILE, then ALTER TABLE ... ENABLE KEYS. However, re-enabling the keys on 320 million rows takes a long time. Is there some value I could tweak in the .cnf file to make this go faster?
TMM
A: 

@rjk is correct. LOAD DATA INFILE is the fastest way to get data into your table. A few other thoughts:

2 minutes seems long to me for 4k rows. SELECTs block INSERTs in MyISAM and are likely slowing down your inserts. I strongly suggest InnoDB, which doesn't have this issue and has better crash recovery, etc. If you must use MyISAM, locking the table before running your inserts may help, or you could try INSERT DELAYED, which lets the INSERT statements return immediately and be processed when the table is free.
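
For reference, INSERT DELAYED is just the ordinary statement with the DELAYED keyword (MyISAM-style engines only; the rows sit in a server-side buffer until the table is free):

INSERT DELAYED INTO mytable (f1, f2) VALUES ($f1, $f2);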

Joshua Martell
Hi, thanks for your comment. I would rather not use InnoDB, as my system consists mostly of SELECT queries and I need to perform the INSERTs only once a week, during which time the system will be taken down, so there would not even be any concurrent selects. I have used LOAD DATA INFILE, but re-enabling the keys (after disabling them) takes a long time. I would like to try to optimise this key re-enabling now. Thanks
TMM
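
One avenue for the slow ENABLE KEYS, offered only as a sketch: MyISAM rebuilds indexes fastest when it can "repair by sorting", and falls back to the much slower key-cache method when the sort buffer or sort file limits are too small for the index. Settings along these lines in my.cnf may be worth benchmarking (the values here are purely illustrative; size them to your own RAM and disk):

[mysqld]
key_buffer_size = 1G
myisam_sort_buffer_size = 256M
myisam_max_sort_file_size = 100G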