Hello,
I'm trying to insert about 500 million rows of garbage data into a database for testing. Right now I have a PHP script looping through a few `SELECT`/`INSERT` statements, each inside a `TRANSACTION` -- clearly this isn't the best solution. The tables are InnoDB (row-level locking).
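For context, here's roughly the shape of the loop (a stripped-down sketch, not my real code; the table names, columns, and random payloads are placeholders):

```php
<?php
// Stripped-down shape of the current single-process loop.
// Table names, columns, and random payloads are placeholders.
$db = new mysqli('localhost', 'user', 'pass', 'testdb');

for ($i = 0; $i < 500000000; $i++) {
    $db->query('START TRANSACTION');

    // Insert a row, read back its auto-increment id, then insert a
    // dependent row that references it.
    $db->query('INSERT INTO parents (payload) VALUES (' . mt_rand() . ')');
    $row = $db->query('SELECT LAST_INSERT_ID() AS id')->fetch_assoc();
    $db->query('INSERT INTO children (parent_id, payload) VALUES ('
               . $row['id'] . ', ' . mt_rand() . ')');

    $db->query('COMMIT');
}
```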
I'm wondering: if I (properly) fork the process, will that speed up the `INSERT`s? At the rate it's going, it will take 140 hours to complete. I'm concerned about two things:
1. If `INSERT` statements must acquire a write lock, will that render forking useless, since multiple processes can't write to the same table at the same time?
2. I'm using `SELECT ... LAST_INSERT_ID()` (inside a `TRANSACTION`). Will this logic break when multiple processes are `INSERT`ing into the database? I could create a new database connection for each fork, so I hope that would avoid the problem (roughly what I have in mind is sketched after these questions).

How many processes should I be using? The queries themselves are simple, and I have a regular dual-core dev box with 2 GB of RAM. I set InnoDB to use 8 threads (`innodb_thread_concurrency = 8`), but I'm not sure whether I should run 8 processes to match, or whether matching process count to InnoDB threads is even the right way to think about it.
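Here is the hypothetical forked version I have in mind (a sketch using `pcntl_fork()`; the worker count, connection details, and queries are all placeholders, and each child opens its own `mysqli` connection):

```php
<?php
// Hypothetical sketch (not my actual code): fork N workers, each with
// its OWN mysqli connection, since a handle can't be shared across forks.
$workers = 4;              // placeholder -- the right number is part of my question
$rowsPerWorker = 1000000;  // placeholder slice size

for ($w = 0; $w < $workers; $w++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die("fork failed\n");
    }
    if ($pid === 0) {
        // Child: fresh connection, so (I hope) LAST_INSERT_ID() only
        // sees this process's own inserts.
        $db = new mysqli('localhost', 'user', 'pass', 'testdb');
        for ($i = 0; $i < $rowsPerWorker; $i++) {
            $db->query('START TRANSACTION');
            $db->query('INSERT INTO parents (payload) VALUES (' . mt_rand() . ')');
            $row = $db->query('SELECT LAST_INSERT_ID() AS id')->fetch_assoc();
            $db->query('INSERT INTO children (parent_id, payload) VALUES ('
                       . $row['id'] . ', ' . mt_rand() . ')');
            $db->query('COMMIT');
        }
        $db->close();
        exit(0);
    }
}

// Parent: wait for every child to finish.
while (pcntl_wait($status) > 0) {
}
```

The idea is that each child works through its own slice of the 500 million rows on its own connection; whether 4, 8, or some other number of workers makes sense on this hardware is exactly what I'm asking.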
Thanks for your help!