Hoping to avoid future memory leaks in PHP programs (Drupal modules, etc.), I've been experimenting with simple PHP scripts that leak memory.

Could a PHP expert help me find out what in this script causes the memory usage to climb continually?

Try running it yourself, changing various parameters. The results are interesting. Here it is:

<?php

function memstat() {
  print "current memory usage: ". memory_get_usage() . "\n";
}

function waste_lots_of_memory($iters) {
  $i = 0;
  $object = new StdClass;
  for (;$i < $iters; $i++) {
    $object->{"member_" . $i} = array("blah blah blha" => 12345);
    $object->{"membersonly_" . $i} = new StdClass;
    $object->{"onlymember"} = array("blah blah blha" => 12345);
  }
  unset($object);
}

function waste_a_little_less_memory($iters) {
  $i = 0;
  $object = new StdClass;
  for (;$i < $iters; $i++) {

    $object->{"member_" . $i} = array("blah blah blha" => 12345);
    $object->{"membersonly_" . $i} = new StdClass;
    $object->{"onlymember"} = array("blah blah blha" => 12345);

    unset($object->{"membersonly_". $i});
    unset($object->{"member_" . $i});
    unset($object->{"onlymember"});

  }
  unset($object);
}

memstat();

waste_a_little_less_memory(1000000);

memstat();

waste_lots_of_memory(10000);

memstat();

For me, the output is:

current memory usage: 73308
current memory usage: 74996
current memory usage: 506676

[edited to unset more object members]

+2  A: 

My understanding of memory_get_usage() is that its output can depend on a wide range of operating system and version factors.

More importantly, unsetting a variable does not instantly free its memory, deallocate it from the process, and give it back to the operating system (again, the characteristics of this operation are operating system dependent).

In short, you probably need a more complicated setup to look at memory leaks.
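
One way to see part of this from inside PHP is the optional real_usage argument of memory_get_usage(). On current PHP versions, false (the default) reports the memory the engine's allocator has in use, while true reports what the allocator has reserved from the system. The sketch below is only an illustration: the helper name report_usage() and the 4 MB buffer size are assumptions, not part of the original script.

```php
<?php
// Hypothetical helper: returns both views of memory usage at one point in time.
function report_usage($label) {
    $used = memory_get_usage(false);     // bytes currently in use by PHP's allocator
    $reserved = memory_get_usage(true);  // bytes the allocator has reserved from the system
    print $label . ": used=" . $used . " reserved=" . $reserved . "\n";
    return array('used' => $used, 'reserved' => $reserved);
}

$before = report_usage("before");
$buffer = str_repeat("x", 4 * 1024 * 1024); // allocate roughly 4 MB
$during = report_usage("during");
unset($buffer);
$after = report_usage("after");
```

Comparing the "used" and "reserved" columns across the three readings shows which figure moves when a variable is unset on your particular system.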

marr75
+14  A: 

unset() doesn't free the memory used by a variable. The memory is freed when the "garbage collector" (in quotes since PHP didn't have a real garbage collector before version 5.3.0, just a memory free routine which worked mostly on primitives) sees fit.

Also, technically, you shouldn't need to call unset() since the $object variable is limited to the scope of your function.
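
A minimal sketch of the scope point, assuming a hypothetical function build_and_return_nothing() that mirrors the loop in the question but never calls unset(). It samples memory_get_usage() while $object is alive and again after the function has returned.

```php
<?php
// Hypothetical helper mirroring the question's loop, without any unset() call.
function build_and_return_nothing($iters) {
    $object = new StdClass;
    for ($i = 0; $i < $iters; $i++) {
        $object->{"member_" . $i} = array("blah" => $i);
    }
    return memory_get_usage(); // usage while $object is still alive
}

$before = memory_get_usage();
$inside = build_and_return_nothing(10000);
$after = memory_get_usage();

print "grown inside the function: " . ($inside - $before) . "\n";
print "left over after return:    " . ($after - $before) . "\n";
```

The exact numbers depend on the PHP version; the comparison between the two lines is what matters.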

Here is a script to demonstrate the difference. I modified your memstat() function to show the memory difference since the last call.

<?php
function memdiff() {
    static $int = null; // usage recorded at the previous call

    $current = memory_get_usage();

    if ($int === null) {
        $int = $current;
    } else {
        print ($current - $int) . "\n";
        $int = $current;
    }
}

function object_no_unset($iters) {
    $i = 0;
    $object = new StdClass;

    for (;$i < $iters; $i++) {
        $object->{"member_" . $i} = array("blah blah blha" => 12345);
        $object->{"membersonly_" . $i} = new StdClass;
        $object->{"onlymember"} = array("blah blah blha" => 12345);
    }
}

function object_parent_unset($iters) {
    $i = 0;
    $object = new StdClass;

    for (;$i < $iters; $i++) {
        $object->{"member_" . $i} = array("blah blah blha" => 12345);
        $object->{"membersonly_" . $i} = new StdClass;
        $object->{"onlymember"} = array("blah blah blha" => 12345);
    }

    unset ($object);
}

function object_item_unset($iters) {
    $i = 0;
    $object = new StdClass;

    for (;$i < $iters; $i++) {

        $object->{"member_" . $i} = array("blah blah blha" => 12345);
        $object->{"membersonly_" . $i} = new StdClass;
        $object->{"onlymember"} = array("blah blah blha" => 12345);

        unset($object->{"membersonly_" . $i});
        unset($object->{"member_" . $i});
        unset($object->{"onlymember"});
    }
    unset ($object);
}

function array_no_unset($iters) {
    $i = 0;
    $object = array();

    for (;$i < $iters; $i++) {
        $object["member_" . $i] = array("blah blah blha" => 12345);
        $object["membersonly_" . $i] = new StdClass;
        $object["onlymember"] = array("blah blah blha" => 12345);
    }
}

function array_parent_unset($iters) {
    $i = 0;
    $object = array();

    for (;$i < $iters; $i++) {
        $object["member_" . $i] = array("blah blah blha" => 12345);
        $object["membersonly_" . $i] = new StdClass;
        $object["onlymember"] = array("blah blah blha" => 12345);
    }
    unset ($object);
}

function array_item_unset($iters) {
    $i = 0;
    $object = array();

    for (;$i < $iters; $i++) {
        $object["member_" . $i] = array("blah blah blha" => 12345);
        $object["membersonly_" . $i] = new StdClass;
        $object["onlymember"] = array("blah blah blha" => 12345);

        unset($object["membersonly_" . $i]);
        unset($object["member_" . $i]);
        unset($object["onlymember"]);
    }
    unset ($object);
}

$iterations = 100000;

memdiff(); // Get initial memory usage

object_item_unset ($iterations);
memdiff();

object_parent_unset ($iterations);
memdiff();

object_no_unset ($iterations);
memdiff();

array_item_unset ($iterations);
memdiff();

array_parent_unset ($iterations);
memdiff();

array_no_unset ($iterations);
memdiff();
?>

If you are using objects, make sure the classes implement __unset() so that unset() can properly clear resources. As much as possible, avoid variable-structure classes such as stdClass, and avoid assigning values to members that are not declared in your class template, because memory assigned to those is usually not cleared properly.

PHP 5.3.0 and up has a better garbage collector, one that can also reclaim reference cycles. It is controlled by the zend.enable_gc setting; if it is turned off in your configuration, call gc_enable() once to turn it on.
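
A small sketch of what the collector added in 5.3 is for, namely objects that refer to each other. gc_enable(), gc_enabled() and gc_collect_cycles() are the real PHP functions (5.3 and later); make_cycle() and the count of 100 are hypothetical, chosen only for illustration.

```php
<?php
gc_enable(); // has no effect if the collector is already on

// Hypothetical helper: builds two objects that refer to each other.
function make_cycle() {
    $a = new StdClass;
    $b = new StdClass;
    $a->other = $b;
    $b->other = $a;
    // $a and $b go out of scope here, but each object still holds the other.
}

for ($i = 0; $i < 100; $i++) {
    make_cycle();
}

$collected = gc_collect_cycles(); // force a collection pass now
print "gc enabled: " . (gc_enabled() ? "yes" : "no") . "\n";
print "collected:  " . $collected . "\n";
```

The value returned by gc_collect_cycles() varies between PHP versions, so treat it as "something was collected" rather than as an exact object count.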

Andrew Moore
@mjgoins: I edited my post with more information. The default memory collector works best on primitive types. As soon as you start introducing resources and objects to the mix, it starts to fail.
Andrew Moore
So the answer is until php 5.3, php *cannot* run an arbitrarily long job in fixed memory space. Even my function that leaks less memory will slowly run out, although with high memory allocation it would take many days.
mjgoins
@mjgoins: I run multiple tasks which may take anywhere from a minute to multiple hours. The trick is to avoid objects (or try to reuse them if you must) and stick with primitives (in your example, you can easily use arrays).
Andrew Moore
stdClass is especially bad to use because it has a variable structure. For most user-built classes, the mem-collector (referring to the simple gc) can always base itself on the class template to know what to clear.
Andrew Moore
Avoiding objects isn't possible if you're re-using other folks' code, or interfacing with a larger system that is based around objects (e.g. Drupal). Thanks for the idea though. I will avoid objects in situations like this whenever possible.
mjgoins
@mjgoins: If you are using objects, make sure that the classes implement __unset() to unset resources used by that class.
Andrew Moore
Edited my post with more explanations.
Andrew Moore
@Andrew Moore: Are you speculating about StdClass or do you know something I don't? Regular classes can also have new, undeclared members assigned.
troelskn
@troelskn: Can, but not usually. Of course, if your class declaration doesn't have a particular variable in its template, it will not be collected by the mem-collector.
Andrew Moore
@mjgoins: feel free to mark my answer as accepted.
Andrew Moore
@Andrew Moore: thanks. I'm new to stackoverflow, obviously.
mjgoins
I added some more functions to prove my point.
Andrew Moore
A: 

I'm not sure about the exact workings of it in PHP, but in some other languages an object containing other objects, when set to null, does not inherently set the other objects to null. It terminates the reference to those objects, but as PHP does not have "garbage collection" in a Java sense, the sub-objects exist in memory until they are removed individually.
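
For PHP specifically, values are reference counted, so an object held only by its parent is normally released when the parent goes away, while anything with another live reference stays. Here is a minimal sketch to check this on your own install, assuming a hypothetical Tracked class that records each destructor call.

```php
<?php
// Hypothetical class that logs its name when it is destroyed.
class Tracked {
    public static $log = array();
    public $name;
    public $children = array();

    public function __construct($name) {
        $this->name = $name;
    }

    public function __destruct() {
        self::$log[] = $this->name;
    }
}

$parent = new Tracked("parent");
$parent->children[] = new Tracked("only-referenced-by-parent");
$shared = new Tracked("shared");
$parent->children[] = $shared; // also still referenced by $shared

$parent = null; // drop the last reference to the parent

print "destroyed so far: " . implode(", ", Tracked::$log) . "\n";
```

The object that is still reachable through $shared should not appear in the log until the script ends or $shared is released.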

Jason
A: 

I believe this comment in the User Contributed Notes for unset() tells the story you're seeing in your test code.

I'm not sure I can be of more help than that.

Peter Bailey
+1  A: 

memory_get_usage reports how much memory PHP has allocated from the OS. It doesn't necessarily match the size of all variables in use. If PHP has a peak in memory use, it may decide not to return the unused memory right away. In your example, the function waste_a_little_less_memory unsets unused variables as it goes, so its peak usage is relatively small. waste_lots_of_memory, by contrast, builds up a lot of variables (= lots of used memory) before deallocating them, so its peak usage is much larger.
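
The difference in peak usage can be observed with memory_get_peak_usage(). The sketch below uses two hypothetical stand-ins, release_as_you_go() and hold_everything(), rather than the exact functions from the question, and the iteration count is arbitrary.

```php
<?php
// Hypothetical stand-in for a function that keeps everything until the end.
function hold_everything($iters) {
    $items = array();
    for ($i = 0; $i < $iters; $i++) {
        $items[] = array("n" => $i);
    }
}

// Hypothetical stand-in for a function that releases each item as it goes.
function release_as_you_go($iters) {
    for ($i = 0; $i < $iters; $i++) {
        $item = array("n" => $i);
        unset($item);
    }
}

$start_peak = memory_get_peak_usage();

release_as_you_go(20000);
$peak_small = memory_get_peak_usage() - $start_peak;

hold_everything(20000);
$peak_large = memory_get_peak_usage() - $start_peak;

print "peak growth, releasing as we go: " . $peak_small . "\n";
print "peak growth, holding everything: " . $peak_large . "\n";
```

Because the peak never goes down, the second figure includes the first; the point is only how much further it rises once everything is held at once.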

troelskn
+3  A: 

memory_get_usage() "Returns the amount of memory, in bytes, that's currently being allocated to your PHP script."

That's the amount of memory allocated to the process by the OS, not the amount of memory used by assigned variables. PHP does not always release memory back to the OS -- but that memory can still be re-used when new variables are allocated.

Demonstrating this is simple. Change the end of your script to:

memstat();
waste_lots_of_memory(10000);
memstat();
waste_lots_of_memory(10000);
memstat();

Now, if you're correct, and PHP is actually leaking memory, you should see memory usage grow twice. However, here's the actual result:

current memory usage: 88272
current memory usage: 955792
current memory usage: 955808

This is because memory "freed" after the initial invocation of waste_lots_of_memory() is re-used by the second invocation.
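
The same check can be stretched over more runs. This is a sketch under assumptions: the function body is a shortened copy of the one in the question so the snippet is self-contained, and the choice of five runs is arbitrary.

```php
<?php
// Shortened copy of the question's function, repeated so the sketch runs on its own.
function waste_lots_of_memory($iters) {
    $object = new StdClass;
    for ($i = 0; $i < $iters; $i++) {
        $object->{"member_" . $i} = array("blah blah blha" => 12345);
        $object->{"membersonly_" . $i} = new StdClass;
    }
}

$readings = array();
for ($run = 0; $run < 5; $run++) {
    waste_lots_of_memory(10000);
    $readings[] = memory_get_usage();
}

// Growth between the first and the last run; a real leak would scale with the run count.
$growth = $readings[4] - $readings[0];
print "usage after each run: " . implode(", ", $readings) . "\n";
print "growth from run 1 to run 5: " . $growth . "\n";
```

If the readings level off after the first run, memory is being re-used rather than leaked; a steady climb across runs would point the other way.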

In my 5 years with PHP, I've written scripts that have processed millions of objects and gigabytes of data over a period of hours, and scripts that have run for months at a time. PHP's memory management isn't great, but it's not nearly as bad as you're making it out to be.

Frank Farmer