views:

97

answers:

5

Hi All

I have a batch process where I need to update my DB table, around 100,000-500,000 rows, from an uploaded CSV file. Normally it takes 20-30 minutes, sometimes longer.

What is the best way to do this? Are there any good practices for it? Any suggestions would be appreciated.

Thanks.

+1  A: 

If you're doing a lot of inserts, are you doing bulk inserts? i.e. like this:

INSERT INTO table (col1, col2) VALUES (val1a, val2a), (val1b, val2b), (....

That will dramatically speed up inserts.

Another thing you can do is disable indexing while you make the changes, then let it rebuild the indexes in one go when you're finished.
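In MySQL, the index-disabling trick looks roughly like this (a sketch for MyISAM tables, where DISABLE KEYS affects non-unique indexes; the table name is hypothetical):

```sql
-- Pause index maintenance during the bulk load
ALTER TABLE my_table DISABLE KEYS;

-- ... run your bulk INSERTs here ...

-- Rebuild the indexes in a single pass when done
ALTER TABLE my_table ENABLE KEYS;
```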

With a bit more detail about what you're doing, you might get more ideas.

Greg
+1  A: 

PEAR has a package called Benchmark with a Benchmark_Profiler class that can help you find the slowest sections of your code so you can optimize them.
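Typical usage looks something like this (a sketch; the section names are hypothetical, and it assumes the PEAR Benchmark package is installed):

```php
<?php
require_once 'Benchmark/Profiler.php';

// true = start automatically and print the report when the script ends
$profiler = new Benchmark_Profiler(true);

$profiler->enterSection('csv_parsing');
// ... parse the CSV file ...
$profiler->leaveSection('csv_parsing');

$profiler->enterSection('db_updates');
// ... run the UPDATE queries ...
$profiler->leaveSection('db_updates');
?>
```

The report shows how much time was spent in each named section, so you can see whether parsing or the database work dominates.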

John Downey
+1  A: 

We had a feature like that in a big application. We had the issue of inserting millions of rows from a CSV into a table with 9 indexes. After lots of refactoring, we found the ideal way to insert the data was to load it into a [temporary] table with the MySQL LOAD DATA INFILE command, do the transformations there, and copy the result into the actual table with multiple insert queries (INSERT INTO ... SELECT FROM), processing only 50k lines or so with each query (which performed better than issuing a single insert, but YMMV).
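A sketch of that workflow (table names, the file path, and the CSV format clauses are hypothetical; adjust to your schema):

```sql
-- Stage the upload in a temporary table shaped like the target
CREATE TEMPORARY TABLE import_tmp LIKE target_table;

LOAD DATA INFILE '/path/to/upload.csv'
INTO TABLE import_tmp
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

-- ... do the transformations on import_tmp here ...

-- Copy over in chunks of ~50k rows; repeat with increasing offsets
INSERT INTO target_table
SELECT * FROM import_tmp LIMIT 50000 OFFSET 0;
```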

soulmerge
Damn. You beat me to it. :) +1
Tomalak
+7  A: 
Tomalak
It's not about MySQL performance. In the batch process I have to update other related tables; it's not insert but update: statuses of applications, payment transactions. It really takes a long time because I have to run some code logic, like creating records in related tables and notifying customers (putting email messages into the mail queue). So I'm thinking of doing it with PHP exec: I just run it as a background process and let it take its time to finish the job. I can't do it with cron, because this is under user control: a user clicks the process button and can later check the logs to see the process status. Thanks Tomalak for your reply.
shuxer
I see. I was assuming inefficiency somewhere, but if it's a lot of work, then it naturally takes some time. Good luck! :)
Tomalak
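The exec-as-background-process idea mentioned above can be sketched like this (the script path and log file are hypothetical; the output redirection and trailing `&` are what keep exec() from blocking the web request):

```php
<?php
// Button handler: kick off the batch job and return immediately.
exec('php /path/to/process_csv.php >> /path/to/process.log 2>&1 &');
echo "Processing started; check the logs for status.";
?>
```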
A: 

I can't do it with cron, because this is under user control: a user clicks the process button and can later check the logs to see the process status

When the user presses said button, set a flag in a table in the database. Then have your cron job check for this flag. If it's there, start processing; otherwise don't. If applicable, you could use the same table to post some kind of status update (e.g. xx% done), so the user has some feedback about the progress.
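A sketch of the flag table for this approach (the table and column names are hypothetical):

```sql
CREATE TABLE batch_jobs (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    status     ENUM('pending', 'running', 'done') NOT NULL DEFAULT 'pending',
    progress   TINYINT UNSIGNED NOT NULL DEFAULT 0,  -- percent done
    created_at DATETIME NOT NULL
);

-- The button press inserts a job:
--   INSERT INTO batch_jobs (created_at) VALUES (NOW());

-- The cron job polls for pending work:
--   SELECT id FROM batch_jobs WHERE status = 'pending' LIMIT 1;

-- And updates progress as it runs, for the user-facing status page:
--   UPDATE batch_jobs SET status = 'running', progress = 40 WHERE id = ?;
```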

troelskn