views:

51

answers:

2

I'm upgrading a script to a new version with a whole new database layout. The upgrade starts fine, but the same query gradually takes more and more time. The query in question is the following:

SELECT nuser.user_id, nfriend.user_id AS friend_user_id, f.time
FROM oldtable_friends AS f
JOIN oldtable_user AS u ON ( u.user = f.user )
JOIN newtable_user AS nuser ON ( nuser.upgrade_user_id = u.id )
JOIN oldtable_user AS uf ON ( uf.user = f.friend )
JOIN newtable_user AS nfriend ON ( nfriend.upgrade_user_id = uf.id )
LIMIT 200
OFFSET 355600

The OFFSET here varies of course as data is fetched in batches of 200 records.

oldtable_friends has about 2 million records.

oldtable_user and newtable_user have around 70,000 records.

That query executes very fast at first but gradually slows down, and after a couple of hours it takes about 30 seconds to execute. Those tables don't change at all while the script is upgrading, so I'm not sure where the bottleneck is. It seems that the query slows down as the OFFSET grows.

Here is the EXPLAIN:

+----+-------------+---------+--------+-----------------+-----------------+---------+-----------------------------------+-------+-------------+
| id | select_type | table   | type   | possible_keys   | key             | key_len | ref                               | rows  | Extra       |
+----+-------------+---------+--------+-----------------+-----------------+---------+-----------------------------------+-------+-------------+
|  1 | SIMPLE      | nuser   | ALL    | upgrade_user_id | NULL            | NULL    | NULL                              | 71638 |             |
|  1 | SIMPLE      | u       | eq_ref | PRIMARY,user    | PRIMARY         | 4       | database.nuser.upgrade_user_id    |     1 |             |
|  1 | SIMPLE      | f       | ref    | user,friend     | user            | 77      | database.u.user                   |    20 |             |
|  1 | SIMPLE      | uf      | eq_ref | PRIMARY,user    | user            | 77      | database.f.friend                 |     1 |             |
|  1 | SIMPLE      | nfriend | ref    | upgrade_user_id | upgrade_user_id | 5       | database.uf.id                    |     1 | Using where |
+----+-------------+---------+--------+-----------------+-----------------+---------+-----------------------------------+-------+-------------+

All the tables have indexes on the fields being used. I can provide the table structures if needed. I've been playing around with MySQL configuration options, and although it helped a bit, it wasn't much. Any suggestions?

+1  A: 

Take a look at ORDER BY … LIMIT Performance Optimization for completeness, although you don't appear to be doing anything wrong as such.

Large OFFSETs are slow: MySQL still has to generate and then discard every row before the offset on each query, so each successive batch does more work than the last. There's no getting around that after a certain point.

You say you're batching 200 records at a time. Why not just do one query and read through all 70,000 rows? That will in fact be much faster.
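
If memory is the worry with reading everything in one go, here's a rough, untested sketch. It assumes the old mysql_* extension your script already uses and an already-open connection in a hypothetical $link variable, and it streams the result instead of buffering it:

<?php
// Untested sketch: one pass over the whole result set, no OFFSET paging.
// Assumes the mysql_* extension and an already-open connection in $link.
$sql = "SELECT nuser.user_id, nfriend.user_id AS friend_user_id, f.time
        FROM oldtable_friends AS f
        JOIN oldtable_user AS u ON ( u.user = f.user )
        JOIN newtable_user AS nuser ON ( nuser.upgrade_user_id = u.id )
        JOIN oldtable_user AS uf ON ( uf.user = f.friend )
        JOIN newtable_user AS nfriend ON ( nfriend.upgrade_user_id = uf.id )";

// mysql_unbuffered_query() streams rows instead of loading the whole result
// set into PHP memory, so a single pass stays cheap on the client side.
$result = mysql_unbuffered_query($sql, $link);

while ($row = mysql_fetch_array($result)) {
    // convert/insert this row into the new layout here
}
?>

One caveat: with an unbuffered result you can't send further queries on the same connection until every row has been fetched, so the inserts into the new tables would need a second connection.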

cletus
A: 

@cletus: There are almost 2 million records, but it's still a good idea. It takes MySQL almost the same time to give me 200 rows or 20,000 rows from that query, so I think it should work.

Unfortunately, when I try to do that in my PHP script I get a "Premature end of script headers" error. After a lot of debugging I'm sure it's not the PHP memory limit or max execution time, but it still happens. I'm able to run the query just fine from the console, and sometimes through phpMyAdmin, but not in my script. I've found that my script runs the query when the OFFSET is small (300,000), but if I increase the OFFSET to 700,000 or 1,500,000 it throws an Internal Server Error. So my question is: is there any sort of timeout on mysql_query() or mysql_fetch_array(), or somewhere else?

BTW: not sure if I should post this as a new question.

Martin
The problem turned out to be a default FCGI configuration option that makes requests time out after 20 seconds. You can raise it in your fcgid.conf: IPCCommTimeout 60
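For anyone hitting the same thing, the directive in context (the exact config file location varies by distribution, so treat this as a sketch):
# mod_fcgid aborts a request when the FastCGI application is silent for longer
# than this many seconds; the default of 20 was shorter than the big query took.
IPCCommTimeout 60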
Martin