views:

184

answers:

6

Hi,

I have a php script that runs a mysql query, then loops the result, and in that loop also runs several queries:

    $sqlstr = "SELECT * FROM user_pred WHERE uprType != 2 AND uprTurn=$turn ORDER BY uprUserTeamIdFK";
    $utmres = mysql_query($sqlstr) or trigger_error($termerror = __FILE__." - ".__LINE__.": ".mysql_error());
    while($utmrow = mysql_fetch_array($utmres, MYSQL_ASSOC)) {
// some stuff happens here    
//  echo memory_get_usage() . " - 1241<br/>\n";
        $sqlstr = "UPDATE user_roundscores SET ursUpdDate=NOW(),ursScore=$score WHERE ursUserTeamIdFK=$userteamid";
        if(!mysql_query($sqlstr)) {
            $err_crit++;
            $cLog->WriteLogFile("Failed to UPDATE user_roundscores record for user $userid - teamuserid: $userteamid\n");
            echo "Failed to UPDATE user_roundscores record for user $userid - teamuserid: $userteamid<br>\n";
            break;
        }
    unset($sqlstr);
    //  echo memory_get_usage() . " - 1253<br/>\n";
// some stuff happens here too
}

The update query never fails.

For some reason, between the two calls of memory_get_usage, there is some memory added. Because the big loop runs about 500.000 or more times, in the end it really adds up to alot of memory. Is there anything I'm missing here?
could it herhaps be that the memory is not actually added between the two calls, but at another point in the script?

Edit: some extra info: Before the loop it's at about 5mb, after the loop about 440mb, and every update query adds about 250 bytes. (the rest of the memory gets added at other places in the loop). The reason I didn't post more of the "other stuff" is because its about 300 lines of code. I posted this part because it looks to be where the most memory is added.

A: 

I think you should try calling mysql_free_result() at some point during the loop. — From the comments:

It's worth noting that mysql_query() only returns a resource for SELECT, SHOW, EXPLAIN, and DESCRIBE queries.

So there is no result to free for an update query.

Anyway, your approach is not the best to begin with. Try mysqli paramterized statements instead, or (even better) updating the rows at the database directly. It looks like all of the SQL in the loop could be handled with one single UPDATE statement.

Tomalak
There's a lot of stuff happening in the loop, I just shortened the query to make it easier to read, and focus on the part where I notice memory being added.It's my understanding that I can only use mysql_free_result() on $utmres after the loop, and that it is not needed to use it on the update query.
Jasper De Bruijn
mysql_free_result will not help for update statements. "It's worth noting that mysql_query() only returns a resource for SELECT, SHOW, EXPLAIN, and DESCRIBE queries."
OIS
@Jasper: Does the leak still exist when you reduce the loop to the pure SQL activities? Maybe the leak is in the other "stuff" that happens?
Tomalak
@OIS: Thanks, I'll update the answer!
Tomalak
I think I should start stripping parts of the "other stuff", to see if it helps. It makes no sense to me though, because I call the memory_get_usage before and after this piece of code.
Jasper De Bruijn
Also, using unset(); in such a loop will only free up memory temporarly, as variable is reassigned on each iteration. Have you verified if garbage collector is enabled (it is by default) on your server ?
Benoit
I tried Google, but is there an easy way to check if garbage collector is enabled?
Jasper De Bruijn
+1  A: 

Part of the reason you may be seeing additional used memory on every iteration is that PHP hasn't (yet) garbage collected the things that are no longer referenced.

R. Bemrose
Ok, that is useful info. When would PHP normally collect this? After the entire loop, the memory usage is around 440mb, but still going. Later in the page, in the 2nd part of the process, the page just times out without an error, which I read could be because there is not enough memory left to show the error message that there is not enough memory left.
Jasper De Bruijn
that's weird, usually, PHP crashes with a fatal OOM error saying something like "went over the memory limit of X bytes by trying to allocate Y bytes".
Adriano Varoli Piazza
Especially weird because of this line "ini_set('memory_limit', '4M');". Once every 100 loops I use "flush(); usleep(50000);", and it seems that prevents the page from timing out.
Jasper De Bruijn
+2  A: 

The best way is probably to get all userIds and flush them to a file. Then run a new script which forks with pipes to x amount of worker drones. Then just give them a small list of userIds to process as they complete each list. With multiple cpus/cores/servers you can finish the task faster. If one worker fails, just start a new one. To use other servers as workers you can call them with curl/fopen/soap/etc from a worker thread.

OIS
This might be the better solution in the long run. This script was designed a long time ago for fairly small amounts of users, and is now being tested against possible millions of users.
Jasper De Bruijn
+4  A: 

This memory leak would only be a problem if it's killing the script with a "memory exhausted" error. PHP will happily garbage collect any unusued objects/variables on its own, but the collector won't kick until it has to - garbage collection can be a very expensive operation.

It's normal to see memory usage climb even if you're constantly reusing the same objects/variables - it's not until memory usage exceeds a certain level that the collector will fire up and clean house.

I suspect that you could make things run much faster if you batched userIDs into groups and issued fewer updates, changing more records with each. e.g. do the following:

UPDATE user_roundscores SET ursUpdDate=NOW() WHERE ursUserTeamIdFK IN (id1, id2, id3, id4, id5, etc...)

instead of doing it one-update-per-user. Fewer round-trips through the DB interface layer and more time on the server = faster running.

As well, consider the impact of now expanding this to millions of users, as you say in a comment. A million individual updates will take a non-trivial amount of time to run, so the NOW() will not be a "constant". If it takes 5 minutes to do the full run, then you're going to get a wide variety of ursUpdDate timestamps. You may want to consider cacheing a single NOW() call in a server-side variable and issue the updates against that variable:

 SELECT @cachednow :p NOW();
 UPDATE .... SET ursUpDate = @cachednow WHERE ....;
Marc B
Thanks for the explanation about the memory, I was trying to get this piece of code to use no memory at all, but it sounds like that's not possible.I updated the update qry in my question, because every user receives his own calculated score, it's not possible to group the query.
Jasper De Bruijn
A: 

The unset call is pointless/irrelevant. Try with mysql_free_result though - It might have some effect.

troelskn
+1  A: 

From the php.net memory_get_usage manual:

Parameters

real_usage Set this to TRUE to get the real size of memory allocated from system. If not set or FALSE only the memory used by emalloc() is reported.

With this parameter set to true, the script showed no increase of memory, like I expected.

Jasper De Bruijn