I have a particular PHP page that, for various reasons, needs to save ~200 fields to a database. These are 200 separate insert and/or update statements. Now the obvious thing to do is reduce this number but, like I said, for reasons I won't bother going into I can't do this.

I wasn't expecting this problem. Selects seem reasonably performant in MySQL but inserts/updates aren't (it takes about 15-20 seconds to do this update, which is naturally unacceptable). I've written Java/Oracle systems that can happily do thousands of inserts/updates in the same time (in both cases running local databases; MySQL 5 vs OracleXE).

Now in something like Java or .Net I could quite easily do one of the following:

  1. Write the data to an in-memory write-behind cache (ie it would know how to persist to the database and could do so asynchronously);
  2. Write the data to an in-memory cache and use the PaaS (Persistence as a Service) model ie a listener to the cache would persist the fields; or
  3. Simply start a background process that could persist the data.

The minimal solution is to have a cache that I can simply update, which will separately go and update the database in its own time (ie it'll return immediately after updating the in-memory cache). This can either be a global cache or a session cache (although a global shared cache does appeal in other ways).

Any other solutions to this kind of problem?

+1  A: 

Just update your local cache (hopefully memcached) and then push the write requests through beanstalkd.
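
Roughly, the write path might look like this - a sketch only, assuming the pecl Memcache extension and the Pheanstalk client library for beanstalkd (the key, tube and field names are made up):

<?php
// Sketch: update the cache synchronously, queue the DB write.
// Assumes pecl Memcache and the Pheanstalk beanstalkd client.
require_once 'pheanstalk/pheanstalk_init.php';

$cache = new Memcache();
$cache->connect('127.0.0.1', 11211);
$queue = new Pheanstalk('127.0.0.1');

function save_field($cache, $queue, $userId, $field, $value) {
    // 1. The cache write is immediate, so the page returns right away.
    $cache->set("user:$userId:$field", $value);
    // 2. The actual INSERT/UPDATE happens later, in a background worker.
    $queue->useTube('db-writes')->put(json_encode(array(
        'user'  => $userId,
        'field' => $field,
        'value' => $value,
    )));
}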

Dustin
Thanks for your reply. I've had a look and there are a lot of moving parts: memcached, beanstalkd and presumably something else to read the work queues and persist the data?
cletus
That is correct. The parts are *really* simple, though, and allow you to greatly scale your app. You'll find yourself wanting to put more stuff through the workers (such as sending out emails, updating search engines, etc.). memcached is orthogonal, but should be everywhere anyway.
Dustin
Any suggestions on what to use for the background process? Something will need to read from the work queue and process the requests. I've seen some blogs about people using Ruby with beanstalkd, possibly Java too. I'm not keen on having another server in the mix though.
cletus
I often run stuff on the same machine. I don't write any PHP at all, but if you can run a background process that just sits on a queue and pops stuff off to process it, running it all even on the same machine would be beneficial.
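
As a sketch of such a worker (same Pheanstalk assumption as above; run it from the CLI on the same machine - the table and column names are made up):

<?php
// Sketch: long-running CLI worker that drains the 'db-writes' tube.
require_once 'pheanstalk/pheanstalk_init.php';

$queue = new Pheanstalk('127.0.0.1');
$queue->watch('db-writes');
mysql_connect('localhost', 'user', 'pass');
mysql_select_db('mydb');

while (true) {
    $job = $queue->reserve();  // blocks until a job is available
    $data = json_decode($job->getData(), true);
    $ok = mysql_query(sprintf(
        "UPDATE fields SET value = '%s' WHERE user_id = %d AND name = '%s'",
        mysql_real_escape_string($data['value']),
        (int) $data['user'],
        mysql_real_escape_string($data['field'])
    ));
    if ($ok) {
        $queue->delete($job);   // done; remove the job from the queue
    } else {
        $queue->release($job);  // failed; put it back for a retry
    }
}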
Dustin
A: 

I would suspect a problem with your SQL inserts - it really shouldn't take that long. Would prepared queries help? Does your MySQL server need more memory dedicated to its key buffer? I think some more questions need to be asked.
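
For reference, prepared queries here would look something like this (a mysqli sketch; the table and column names are made up):

<?php
// Sketch: prepare the UPDATE once, execute it ~200 times.
$db = new mysqli('localhost', 'user', 'pass', 'mydb');
$stmt = $db->prepare('UPDATE fields SET value = ? WHERE user_id = ? AND name = ?');
foreach ($fields as $name => $value) {
    $stmt->bind_param('sis', $value, $userId, $name);
    $stmt->execute();  // the statement is parsed once, executed many times
}
$stmt->close();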

DGM
A: 

How are you doing the inserts? Are you doing one insert per record,

mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');
mysql_query('INSERT INTO tableName VALUES(...)');

or are you using a single query?

mysql_query('INSERT INTO tableName VALUES(...),(...),(...),(...)');

The latter of the two options is substantially faster, and from experience the first option will take much longer, as PHP must wait for each query to finish before moving on to the next.
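
Building the single query from an array is straightforward - a sketch (the row structure and table are illustrative):

<?php
// Sketch: assemble one multi-row INSERT instead of N single-row ones.
$values = array();
foreach ($rows as $row) {
    $values[] = sprintf("(%d, '%s')",
        (int) $row['id'],
        mysql_real_escape_string($row['value']));
}
mysql_query('INSERT INTO tableName (id, value) VALUES ' . implode(',', $values));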

Unkwntech
Each field update is either an update if the value exists already or an insert if it doesn't. I already know whether it exists because I've preloaded all the values. This would help with bulk inserts, but the modifications are, and will remain, primarily updates.
cletus
+1  A: 

Look at the statistics for your database while you do the inserts. I'm guessing that one of your updates locks the table, and therefore all your statements are queued up and you experience this delay. Another thing to look into is index creation/updating, because the more indices you have on a table, the slower all UPDATE and INSERT statements get.

Another thing is that I think you are using MyISAM (table engine), which locks the entire table on UPDATE. I suggest you use InnoDB instead. InnoDB is slower on SELECT queries, but faster on INSERT and UPDATE because it only locks the row it's working on and not the entire table.
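
Checking the engine and converting the table is quick (a sketch; swap in your own table name):

<?php
// Sketch: see which engine the table uses, convert MyISAM to InnoDB.
$res = mysql_query("SHOW TABLE STATUS LIKE 'tableName'");
$status = mysql_fetch_assoc($res);
if ($status['Engine'] == 'MyISAM') {
    mysql_query('ALTER TABLE tableName ENGINE = InnoDB');
}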

Till
A gross oversimplification (and possibly an inaccurate one) of the performance characteristics of MySQL engines. Locking the entire table is probably faster than locking rows (because there's only one lock to take).
MarkR
What? MyISAM locks the entire table, thus making writes unavailable during the lock, which leads to a large backlog (= delay). InnoDB only locks the row it's working on - other processes can still write to the table.
Till
+1  A: 

You should be able to do 200 inserts relatively quickly, but it will depend on lots of factors. If you are using a transactional engine and doing each one in its own transaction, don't - that creates way too much I/O.
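
For example (a sketch; with autocommit on, each statement is otherwise its own transaction and its own flush to disk):

<?php
// Sketch: one transaction around the whole batch instead of 200.
mysql_query('START TRANSACTION');
foreach ($statements as $sql) {
    mysql_query($sql);
}
mysql_query('COMMIT');  // a single flush covers all ~200 statements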

If you are using a non-transactional engine, it's a bit trickier. Using a single multi-row insert is likely to be better as the flushing policy of MySQL means that it won't need to flush its changes after each row.

You really want to be able to reproduce this on your production-spec development box and analyse exactly why it's happening. It should not be difficult to fix.

Of course another possibility is that your inserts are slow because of extremely large tables or large numbers of indexes - in which case you should scale your database server appropriately. Inserting lots of rows into a table whose indexes don't fit into RAM (or where RAM isn't correctly configured for caching those indexes) generally gets pretty smelly.

BUT don't look for a way of complicating your application when there is a way of easily tuning it instead, keeping the current algorithm.

MarkR
The "premature optimisation" bit is good advice. The tables aren't overly complicated but I've got a couple of things I can check now (indexes and the database engine).
cletus
+2  A: 

One more solution you could use (instead of tuning MySQL :) ) is a JMS server with a STOMP connection driver for PHP, to write data to the database server asynchronously. ActiveMQ has built-in support for the STOMP protocol, and there is the StompConnect project, which is a STOMP proxy for any JMS-compliant server (OpenMQ, JBossMQ, etc.).
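
The PHP side might look like this - a sketch only, assuming the pecl stomp extension and an ActiveMQ broker on the default STOMP port (the queue name and payload are made up):

<?php
// Sketch: fire-and-forget writes via STOMP; a JMS consumer on the
// other side does the actual INSERTs/UPDATEs against MySQL.
$stomp = new Stomp('tcp://localhost:61613');
foreach ($fields as $name => $value) {
    $stomp->send('/queue/db-writes', json_encode(array(
        'field' => $name,
        'value' => $value,
    )));
}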

dotsid
Also, OpenMQ, part of the GlassFish application server (but also runnable without it), is a JMS server which will support STOMP in version 4.4.
mjustin
+1  A: 


mysql_query('INSERT INTO tableName VALUES(...),(...),(...),(...)')

The query statement given above is better, but there is another way to improve insert performance. Follow these steps:

  1. Create a CSV (comma-separated) file or a simple text file and write out all the data you want to insert, using a file-writing mechanism (like the FileOutputStream class in Java).
  2. Use this command:

LOAD DATA INFILE 'data.txt' INTO TABLE table2 FIELDS TERMINATED BY '\t';

  3. If you are not clear about this command, follow the link given below:

http://dev.mysql.com/doc/refman/5.1/en/load-data.html
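
In PHP, steps 1 and 2 together might look like this (a sketch; the file path and table are illustrative, and depending on privileges you may need LOAD DATA LOCAL INFILE instead):

<?php
// Sketch: dump the rows to a tab-delimited file, then bulk-load it.
$path = '/tmp/data.txt';
$fh = fopen($path, 'w');
foreach ($rows as $row) {
    fwrite($fh, implode("\t", $row) . "\n");
}
fclose($fh);
mysql_query("LOAD DATA INFILE '$path' INTO TABLE table2 FIELDS TERMINATED BY '\\t'");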

Anup Kumar Srivastava