I'm trying to run the following program, which calculates the roots of polynomials of degree up to d whose coefficients are all +1 or -1, and then stores them in files.

d = 20; n = 18000; 
f[z_, i_] := Sum[(2 Mod[Floor[(i - 1)/2^k], 2] - 1) z^(d - k), {k, 0, d}];

Here f[z,i] gives a polynomial in z whose plus and minus signs count in binary: bit k of i - 1 sets the sign of z^(d - k). For d = 2, say, we would have

f[z,1] = -z^2 - z - 1
f[z,2] =  z^2 - z - 1
f[z,3] = -z^2 + z - 1
f[z,4] =  z^2 + z - 1

DistributeDefinitions[d, n, f]

ParallelDo[
  Do[
    root = N[Root[f[z, i], j]];
    (* map the root to a cell of the n by n grid *)
    {a, b} = Round[n ({Re[root], Im[root]}/1.5 + 1)/2],
    {i, 1, 2^d}],
  {j, 1, d}]

I realise reading this probably isn't too enjoyable, but it's relatively short anyway. I would have cut it down to the relevant parts, but I really have no clue where the trouble is. I'm calculating all the roots of f[z,i], rounding each one so that it corresponds to a point in an n by n grid, and saving that data in various files.

For some reason, the memory usage in Mathematica creeps up until it fills all the memory (6 GB on this machine); then the computation continues extremely slowly; why is this?

I am not sure what is using up the memory here - my only guess was the stream of files used up memory, but that's not the case: I tried appending data to 2GB files and there was no noticeable memory usage for that. There seems to be absolutely no reason for Mathematica to be using large amounts of memory here.

For small values of d (15 for example), the behaviour is the following: I have 4 kernels running. As they all run through the ParallelDo loop (each doing a value of j at a time), the memory use increases, until they all finish going through that loop once. Then the next times they go through that loop, the memory use does not increase at all. The calculation eventually finishes and everything is fine.

Also, quite importantly, once the calculation stops, the memory use does not go back down. If I start another calculation, the following happens:

- If the previous calculation stopped when memory use was still increasing, it continues to increase (it might take a while to start increasing again, basically to get to the same point in the computation).

- If the previous calculation stopped when memory use was not increasing, it does not increase further.

Edit: The issue seems to come from the relative complexity of f: changing it into some simpler polynomial seems to fix it. I thought the problem might be that Mathematica remembers f[z,i] for specific values of i, but putting f[z,i] =. just after calculating a root of f[z,i] only produces a complaint that the assignment did not exist in the first place, and the memory is still used.

It's quite puzzling really, as f is the only remaining thing I can imagine taking up memory, but defining f in the inner Do loop and clearing it each time after a root is calculated does not solve the problem.

+4  A: 

Ouch, this is a nasty one.

What's going on is that N caches its results in order to speed up future calculations in case you need them again. Sometimes this is absolutely what you want, but sometimes it just breaks the world. Fortunately, you do have some options. One is to use the ClearSystemCache command, which does just what it says on the tin. After I ran your un-parallelized loop for a little while (before getting bored and aborting the calculation), MemoryInUse reported ~160 MiB in use. Using ClearSystemCache got that down to about 14 MiB.
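As a rough illustration of that measurement, here is a minimal sketch of the un-parallelized loop with ClearSystemCache called after each pass. It makes two assumptions that are not in the question: d is reduced to 8 so it finishes quickly, and Root is given a pure function built from f[z, i] rather than the bare expression. The names before and after are only for illustration.

```mathematica
d = 8;  (* assumption: small degree for a quick run; the question uses d = 20 *)
f[z_, i_] := Sum[(2 Mod[Floor[(i - 1)/2^k], 2] - 1) z^(d - k), {k, 0, d}];

before = MemoryInUse[];

Do[
  (* compute the j-th root of every polynomial, as in the question's inner loop *)
  Do[N[Root[Function[z, Evaluate[f[z, i]]], j]], {i, 1, 2^d}];
  (* discard the results cached so far before moving on to the next j *)
  ClearSystemCache[],
  {j, 1, d}];

after = MemoryInUse[];

(* bytes still held by the kernel compared with the start of the run *)
Print["bytes retained: ", after - before]
```

Running the same loop once with and once without the ClearSystemCache[] line should show how much of the growth comes from the cache on a given machine.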

One thing you should look at doing, instead of calling ClearSystemCache programmatically, is to use SetSystemOptions to change the caching behavior. You should take a look at SystemOptions["CacheOptions"] to see what the possibilities are.
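A minimal sketch of what that could look like follows. The choice of the "Numeric" group and its "Cache" flag is an assumption about which setting matters here; check it against what SystemOptions["CacheOptions"] actually lists on your version, and measure whether it changes anything before relying on it.

```mathematica
(* list the current cache settings, grouped by cache type *)
SystemOptions["CacheOptions"]

(* assumption: the "Numeric" group is the one used by N; confirm it in the output above *)
SetSystemOptions["CacheOptions" -> {"Numeric" -> {"Cache" -> False}}];

(* read the flag back to see whether the change was accepted *)
"Cache" /. ("Numeric" /. ("CacheOptions" /. SystemOptions["CacheOptions"]))
```

The setting only applies to the kernel it is evaluated in, so parallel subkernels would each need the same call.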

EDIT: It's not terribly surprising that the caching causes a bigger problem for more complex expressions. It's got to be stashing copies of those expressions somewhere, and more complex expressions require more memory.

Pillsy
Hmm, I'm having trouble replicating your results. At the moment, two problems occur: in the unparallelized version, when I call ClearSystemCache, MemoryInUse does report that the memory in use has gone back down, but the task manager shows the kernel still using as much memory. Secondly, in parallelized mode, I cannot find the option to clear the cache of the individual kernels. But you seemed to have found the precise cause, now it's more a matter of finding how to treat it.
Sam Derbyshire
Messing with the CacheOptions did not prove fruitful either - I set everything to false and max byte sizes to 0 and it made no difference (to the unparallelized version and so no difference to the parallelized version either).
Sam Derbyshire
Ok, adding ClearSystemCache does in fact work; for some reason it didn't work the first time, but now it does work. It even works in the parallel version. Thanks!
Sam Derbyshire
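For the parallel case, one possible arrangement is sketched below; it is not necessarily what was used in the comment above. ClearSystemCache[] sits in the ParallelDo body, so it runs on whichever subkernel handled that value of j, and ParallelEvaluate clears every subkernel once more at the end. The small d and the pure-function form passed to Root are assumptions made to keep the sketch short and self-contained.

```mathematica
d = 8;  (* assumption: small degree for a quick run; the question uses d = 20 *)
f[z_, i_] := Sum[(2 Mod[Floor[(i - 1)/2^k], 2] - 1) z^(d - k), {k, 0, d}];
DistributeDefinitions[d, f];

ParallelDo[
  Do[N[Root[Function[z, Evaluate[f[z, i]]], j]], {i, 1, 2^d}];
  (* evaluated on the subkernel that just finished this j *)
  ClearSystemCache[],
  {j, 1, d}];

(* clear the cache on every subkernel, then report each one's memory use *)
ParallelEvaluate[ClearSystemCache[]];
ParallelEvaluate[MemoryInUse[]]
```

The last line returns one byte count per subkernel, which can be compared with the figures the task manager shows for the kernel processes.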