Is there a way to fill the free RAM on a linux machine with random data?

The reason I'm asking this: I'm working in a group where we do numerical programming in Fortran. Sometimes, people mess up working with double precision reals, so that programs that should give double precision results only give single precision.

If my understanding is correct, one would see random fluctuations of the result beyond the single precision limit in such a program. That is, if you run the same program with the same input several times, you get randomly different results each time. What you see (the random part) depends on the random values in the free RAM of the machine. But in practice, if you run the program repeatedly on the same machine, the same parts of memory tend to be reused, and they hold the same random data, leading to the same output every time.

My idea is that if you could overwrite the memory with random data, you would actually see the random fluctuations in your program output. That would make it a lot easier to find these bugs.

Is this idea whack, or if not, how do I fill the memory? Can I pipe /dev/random into the RAM, or something?

+1  A: 

I would think that random data would make debugging much, much harder. Is the randomness in the answers caused by random values in memory or by a calculation bug? I would think fixed and known values would be better.

On the FORTRAN side, are you saying 'mixed precision' numbers are used interchangeably? I'm not clear on the actual problem.

But I have no idea how to fill free memory in Linux with anything.

n8wrl
The problem can occur, for example, if you do a conversion and forget to make the precision explicit. For example (with idp=8 for double precision): real(idp) :: a; complex(idp) :: b; a = 1.0_idp; b = cmplx(a, kind=idp). If you forget the 'kind=idp' in the call to cmplx (which happens easily), the resulting value of b will only be a copy of a up to the single precision limit. The remaining digits can have random fluctuations, which depend on how the memory was previously used. There are other examples of problems like that, too. If you can actually force the fluctuations, it's easier to debug.
Michael Goerz
+2  A: 

If you have a recent (>=2.4 it seems) glibc, you can set the environment variable MALLOC_PERTURB_ to make malloc() return memory that is set to some value. See http://udrepper.livejournal.com/11429.html and inside http://people.redhat.com/drepper/defprogramming.pdf

Then the question is if your Fortran program uses the glibc malloc(), I guess it depends on the Fortran compiler.
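A rough sketch of how one might use this: run the same command under several MALLOC_PERTURB_ values and check whether the output changes. The helper names (run_with_perturb, is_stable) and the stand-in command are made up for illustration; in practice you would pass the path to your own executable. Note that this only affects heap memory handed out by glibc's malloc(), so stack variables and static arrays are untouched.

```python
import os
import subprocess
import sys


def run_with_perturb(cmd, values=(1, 85, 170)):
    """Run cmd once per MALLOC_PERTURB_ value and collect its stdout.

    Hypothetical helper: the values are arbitrary non-zero bytes.
    """
    outputs = {}
    for value in values:
        # Copy the current environment and add the glibc tuning variable.
        env = dict(os.environ, MALLOC_PERTURB_=str(value))
        proc = subprocess.run(
            cmd, env=env, capture_output=True, text=True, check=True
        )
        outputs[value] = proc.stdout
    return outputs


def is_stable(outputs):
    """True if every run printed exactly the same text."""
    return len(set(outputs.values())) == 1


# Stand-in command (an assumption for the demo); replace it with
# something like ["./my_program", "input.dat"].
demo_cmd = [sys.executable, "-c", "print(sum(i * 0.1 for i in range(10)))"]
demo_outputs = run_with_perturb(demo_cmd)
print(is_stable(demo_outputs))
```

If the output differs between perturbation values, the program is probably reading heap memory it never initialized. If it stays the same, that proves little, since the suspect data may not live on the heap at all.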

uekstrom
Also note that, contrary to what many people seem to believe, memory that you get from malloc() (or allocate() in Fortran) is not guaranteed to be zeroed, although many operating systems offer this as an option.
uekstrom
+2  A: 

I would try writing unit tests, using something like fUnit, to ensure that double precision values always work as expected. Tests that require a double precision result often expose the cases where a single precision result is being stored.

Eg: write a test that calls a function with various inputs that should generate double precision outputs, and test that this works with an assert().
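To illustrate the idea (in Python rather than fUnit, purely as a stand-in), a test can compare a result against a double precision reference with a tolerance far tighter than single precision can satisfy. The names to_single and agrees_to_double are invented for this sketch, and the 1e-12 tolerance is an assumption you would tune to your problem.

```python
import math
import struct


def to_single(x):
    """Round a double to the nearest IEEE single precision value and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def agrees_to_double(value, reference, rel_tol=1e-12):
    """Assert-style check: does value match reference to ~12 digits?"""
    return math.isclose(value, reference, rel_tol=rel_tol)


reference = math.sqrt(2.0)

good = math.sqrt(2.0)             # computed entirely in double precision
bad = to_single(math.sqrt(2.0))   # simulates a value stored as single

print(agrees_to_double(good, reference))  # True
print(agrees_to_double(bad, reference))   # False
```

A value that has passed through single precision is only good to about 7 significant digits, so it fails the check every time, with no need for randomness.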

sheepsimulator
+5  A: 

Your understanding is incorrect. You cannot fill a program's memory with random data before it starts executing, and even if you could, it wouldn't solve your problem.

If your Fortran program declares a single precision floating point variable, the compiler will allocate a 32 bit cell in memory to hold the value. Each time your program reads from the variable, the processor will fetch a 32 bit value from the cell. Each time you assign to the variable, the processor will write a 32 bit value to the cell. Under no circumstances should random bits "bleed" into the value from the cells before or after the cell.

While floating point arithmetic is not precise, it is not random either. If you calculate 1.0 / 3.0 + 1.0 / 3.0 + 1.0 / 3.0 one thousand times, you will get exactly the same result, rounding error and all, each and every time.
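A small sketch of this point, using Python's struct module to simulate storing a double in a 32 bit cell (the to_single helper is made up for the illustration). The digits beyond the single precision limit do change, but they are fixed by rounding, so repeating the conversion always gives the same value.

```python
import struct


def to_single(x):
    """Round a double to the nearest IEEE single precision value and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


a = 1.0 / 3.0       # double precision value
b = to_single(a)    # the same value after passing through a 32 bit cell

# Repeat the conversion many times and collect the distinct results.
results = {to_single(1.0 / 3.0) for _ in range(1000)}

print(f"{a:.17f}")
print(f"{b:.17f}")
print(len(results))  # 1
```

The two printed values agree in roughly the first 7 digits and differ after that, and the set holds a single element.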

The second point is that when a program is executed on Linux, all data memory is carefully preinitialized to zero by the operating system. This is done to avoid your program behaving differently each time you run it: that would be a BAD THING. EDIT: another reason this is done is to prevent leakage of private information from one process to another.

(Commenters: please note that I've deliberately skated over a number of issues to make the explanation simple.)

Stephen C
I'm aware of the imprecisions of floating point arithmetic, like you illustrate. That's exactly the point. If the program has these kinds of bugs, the results are imprecise but not random. However, if there are single/double precision conversion issues, the results are random. I'm quite sure that in Fortran memory is not initialized unless you manually request it. That means that the previous use of that memory location can have an impact in the form of random fluctuations. These things can also be compiler-dependent.
Michael Goerz
"I'm quite sure that in fortran memory is not initialized unless you manually request it." If you run on a modern multi-user operating system, I can guarantee that the memory that any program starts executing in WILL be initialized. Otherwise one program can pick up private information left in memory when another one exits or dies.
Stephen C
+1  A: 

You've asked for help to implement your solution to a problem, namely memory randomization. However, I feel that it is an odd and possibly hard-to-debug solution.

It appears to me that you would benefit more from:
- static code analysis tools
- specific unit testing
- checklists for code review, specifically targeted at this problem

Sometimes one can think of even simpler solutions: if you can do without single precision math, you might prevent the linking of such libraries, so the error would show up as a link error, early in your development process. Good luck.

Adriaan
+1  A: 

What you want to achieve, although noble in intent and interestingly conceived, reminds me of Wile E. Coyote's plans to catch the Road Runner, when a rifle and a sniping action would have been the best option.

If you have the problem you present, it means that there is a structural problem in your code, and you are losing control of your program. Although I perfectly know how software is developed in academia, and in fortran, throwing yourself down the cliff just because the rest of the world does it is problematic.

What you should do is an audit of your code, and then beat some grad student if he messes it up again.

Stefano Borini
It's not actually my program, it's just another guy in my group that asked me for help. He has an old complicated mess of a program, and he found out that his results fluctuate if he changes something unrelated. I'm pretty sure it's a double/single precision problem. Besides... I'm the grad student (but I wasn't the one who messed up ;) ) In any case, doing a full audit of his program is out of the question, it's much too messy for that and would take too much time.
Michael Goerz
It depends on what he changes and how he changes it. Who knows? It could be as you say, but who can really say for sure? I see your point in trying to debug this, but even assuming you find out that it is a precision issue, you still don't know where it occurs, so you will need an audit anyway.
Stefano Borini
+1  A: 

Linux provides you with /proc/pid/maps and /proc/pid/mem, for your own pleasure. Of course you have to be extra careful when writing there. Also, keep in mind the only memory segment available to each process is its own, so you'll probably have to do some attaching and code patching to get where you want. Good luck, anyways. :)
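As a read-only starting point, here is a sketch that parses the /proc/pid/maps format to list a process's memory regions. The parse_maps_line helper and the sample line are invented for illustration, and the sketch deliberately never writes to /proc/pid/mem.

```python
import os


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into address range, perms and path."""
    # Fields: address range, perms, offset, device, inode, optional path.
    fields = line.split(maxsplit=5)
    start_hex, end_hex = fields[0].split("-")
    return {
        "start": int(start_hex, 16),
        "end": int(end_hex, 16),
        "perms": fields[1],
        "path": fields[5].strip() if len(fields) > 5 else "",
    }


# Made-up sample line in the maps format, so the parser can be tried anywhere.
sample = "7ffd1c3e0000-7ffd1c401000 rw-p 00000000 00:00 0          [stack]"
region = parse_maps_line(sample)
print(region["perms"], region["path"], region["end"] - region["start"])

# On Linux, inspect the current process; skipped on other systems.
if os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as fh:
        writable = [r for r in map(parse_maps_line, fh) if "w" in r["perms"]]
    print(len(writable), "writable regions in this process")
```

Only regions marked writable (such as [heap] and [stack]) could even be candidates for overwriting, and doing so in a running program is a good way to crash it.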

edit: It's still quite a few times more complicated than a code audit - which also has greater chances to reveal the actual source of the problem.

Michael Foukarakis