views:

33

answers:

1

Hi Guys,

I am writing a set of classes for a crawler. It crawls a start page, pulls three links based on parameters (found using Simple HTML DOM Parser, which allows jQuery-like selectors), crawls those pages, then goes to page 2 and picks the next three links. The current maximum is 57 pages.

Needless to say, I am getting this error message:

Allowed memory size of 50331648 bytes exhausted

Is there any way I can avoid running out of memory?

To let you know: after pulling in the contents of the first page, I run a go() function, which continuously pulls in pages until $this->maxpages is reached. I suppose I could run the loop when instantiating the classes, but would that help?
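A minimal sketch of the loop described above; the class and property names (`Crawler`, `go()`, `$maxpages`) are assumptions based on the description, not the actual code:

```php
<?php
// Hypothetical reconstruction of the structure described in the question:
// go() keeps advancing through pages until $this->maxpages is reached.
class Crawler {
    public $maxpages = 57;
    private $page = 0;

    public function go() {
        while ($this->page < $this->maxpages) {
            $this->page++;
            // fetch page $this->page, pull three links, crawl them...
        }
        return $this->page;
    }
}

$c = new Crawler();
echo $c->go(), "\n";
```

Whether the loop runs from go() or from the constructor makes no difference to memory usage; what matters is how much data each iteration keeps alive.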

+1  A: 

You can adjust the memory limit:

ini_set('memory_limit', '128M');

But I'd try to make the script use less memory. Make sure you free data and references to anything that no longer needs to exist.
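For example, a sketch of dropping each page's data as soon as it has been processed, so memory stays roughly flat across iterations (the `$html` variable is a stand-in for a loaded page; if you're using Simple HTML DOM, it also provides a `clear()` method for releasing a parsed document):

```php
<?php
// Release each page's data before loading the next one, instead of
// letting 57 pages' worth of content accumulate in memory.
for ($page = 1; $page <= 3; $page++) {
    $html = str_repeat('x', 1000000); // stand-in for loading a page
    // ... extract links, process $html ...
    unset($html); // drop the reference so the memory can be reclaimed
}
echo "done\n";
```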

memory_get_usage() can be useful for debugging where the memory usage is accumulating.
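A small sketch of that kind of instrumentation, logging usage around a step you suspect (the `range()` call just simulates building up crawl data):

```php
<?php
// Measure memory before and after a suspect step to see how much it retains.
$before = memory_get_usage();
$data = range(1, 100000); // simulate accumulating crawl data
$after = memory_get_usage();

printf(
    "step retained %d bytes (peak so far: %d bytes)\n",
    $after - $before,
    memory_get_peak_usage()
);
unset($data);
```

Sprinkling a line like this between crawl iterations quickly shows whether usage grows per page (a leak) or stays flat.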

Also, if you aren't using PHP 5.3, you may consider upgrading, since its garbage collector can collect circular references.

konforce
The perfect answer, and I will mark it as such. Just, can you give some examples of how to free data / references? Is that like unsetting vars, etc.?
Liam Bailey
If there is no reference to an object, it will eventually be destroyed. So `$foo = new Foo(); $foo = new Foo();` won't leak any data; you don't need to `unset()` the first instance. But circular references (e.g., a parent references a child, and the child references the parent) will not be freed before PHP 5.3, so structures like that will build up over the duration of the script. Search for PHP circular reference topics for more information.
konforce