I understood that in certain Windows XP programs, like Photoshop, there is something called "scratch disks". What I understand this to mean, and please correct me if I'm wrong, is that Photoshop manages its own virtual memory on the hard drive instead of letting Windows manage it. I understood that the reason for this is a limit Windows XP places on how much total memory a process can use, regardless of HD space. I think it's around 3 GB. Did I get it right so far?

I am making an application in Python for running simulations. It will take a lot of memory, and will run on Windows XP. Is it possible for it to use scratch disks? How?

+1  A: 

Scratch disks will benefit your application if it works with very big files.

Is that the case?

If not, then I don't think you'll find that scratch disks benefit your application.

Konstantinos
Very big objects in memory. Does that qualify?
cool-RR
I suppose it does, since it's pretty much the same thing: Photoshop doesn't hold the files in memory in their entirety either, hence the temporary space.
Konstantinos
+5  A: 

Until you ACTUALLY run out of memory, thinking about this is a waste of time.

When you finally do run out of memory, you'll need to use a temporary file to store objects that your process needs but can't fit into memory.

Use pickle or shelve (see the Data Persistence section of the Python library reference) to store your objects in a file. If that file happens to be on a disk named "scratch", well, that's nice.

Sometimes you want your temporary files to be on a separate disk from your other working files for performance reasons. In some environments (SAN, NAS, storage arrays) your disks are virtual, and looking for a "scratch" disk doesn't have any performance benefit. In other environments (e.g., you own all the hardware) you can put temporary files on some other drive, making that drive a "scratch" disk.
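For example, a minimal sketch of the shelve approach (the scratch path below is an assumption; point it at whatever drive you treat as your scratch disk):

    import shelve

    # Hypothetical scratch location -- any writable path works.
    SCRATCH_PATH = r"D:\scratch\simulation_data"

    db = shelve.open(SCRATCH_PATH)
    try:
        # The dict is pickled and written to the scratch file...
        db["run_42"] = {"temperature": 300.0, "steps": 10 ** 6}
        # ...and unpickled back into memory only when you ask for it.
        result = db["run_42"]
        print(result["temperature"])
    finally:
        db.close()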

S.Lott
I know I can write to a file. But I am hoping there is some solution that will manage that for me, so I can create objects in Python as big as I want and it will manage the temporary files by itself. Is there such a thing?
cool-RR
Yes. I said you needed to read the Data Persistence section of the Python library. The section I said you should read includes pickle, shelve, cPickle and other modules that save objects to files.
S.Lott
Agreed on waiting until you actually do run out of memory before optimizing (premature optimization == bad), but you should still plan for the possibility in the design.
Shane C. Mason
"planning for the possibility" has often been an excuse for wasting time on an attractive nuisance like this. Change the algorithm to be a pipeline and you suddenly don't need any of this.
S.Lott
+1  A: 

Memory-mapped files might be what you are looking for. Python's mmap module lets you use a file like a mutable string in memory.
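A minimal sketch with Python's standard mmap module, assuming a hypothetical 100 MB backing file named scratch.bin (the OS pages the data in and out for you):

    import mmap

    SIZE = 100 * 1024 * 1024  # 100 MB

    # Create a zero-filled backing file of the desired size.
    with open("scratch.bin", "wb") as f:
        f.truncate(SIZE)

    # Map it into memory and treat it like a mutable string.
    with open("scratch.bin", "r+b") as f:
        mm = mmap.mmap(f.fileno(), SIZE)
        mm[0:5] = b"hello"   # slice assignment writes through to the file
        print(mm[0:5])       # b'hello'
        mm.close()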

Shane C. Mason
Looks interesting, I'll check it out.
cool-RR
Because they're mapped to a memory range, this doesn't help with the maximum-process-size limit.
bobince
+2  A: 

I understood that the reason for this is some limitation by Windows XP on how much total memory a process can take, regardless of HD space. I think it's around 3 GB.

Just an FYI, this is a limitation of 32-bit operating systems in general rather than a Windows XP problem. You'll have the same problem in 32-bit Vista, Linux, BSD... you get the idea. If you go the 64-bit route, you don't have these problems.

For example, Windows XP x64 allows up to 8 terabytes of memory per process.
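As a quick sanity check, you can ask the interpreter itself whether it is 32-bit or 64-bit, since the per-process limit follows from the bitness of the Python build, not just the edition of Windows:

    import struct
    import sys

    # Size of a C pointer in this build: 32 on a 32-bit Python, 64 on 64-bit.
    bits = struct.calcsize("P") * 8
    print("%d-bit Python on %s" % (bits, sys.platform))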

Jason Baker
Thanks for the info. However, my program must work on Win32.
cool-RR
A 32-bit OS can access up to 64 GB of physical RAM if it takes advantage of PAE (which non-server versions of Windows don't), though each process is still limited to a 32-bit address space. FreeBSD supports PAE, but they recommend switching to 64-bit instead. I don't know about GNU/Linux.
Bastien Léonard
@cool-RR: I understand this. I just wanted to point out that it isn't necessarily a weird Windows issue. @Bastien: Technically, that would be a 36-bit OS. :-P
Jason Baker
+1  A: 

The Win32 API provides this: link text. You may be able to use these functions through PyWin32.

Bastien Léonard
Looks very interesting! I think this might be it.
cool-RR
Using memory-mapped I/O might be even simpler; that is, if Python can actually support it. http://msdn.microsoft.com/en-us/library/ms810613.aspx
Jasper Bekkers
+1  A: 

You could combine S.Lott's answer about using pickle (though you should use cPickle for better performance) with SQLite.

SQLite is built into Python 2.5 and up (as the sqlite3 module), so all you'll need to do is import it :). Then just store the pickled objects as strings in there and you'll have a nice, fast method of accessing the data (compared to building your own method) that will help keep you organized as well.

Note: cPickle is almost identical to pickle in use. The only difference is that it is written in C, which makes it much faster.

Useful Python Docs: pickle, cPickle, sqlite3

edit: It may be a good idea to have a user-controlled memory usage limit. It would be a shame to be storing a bunch of data on disk and waiting on slow-ass disk I/O when the user has 8 GB of RAM ;)
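A minimal sketch of this pickle-plus-SQLite idea; the database file name and the save/load helpers are hypothetical:

    import sqlite3
    try:
        import cPickle as pickle  # C implementation on Python 2
    except ImportError:
        import pickle             # Python 3: pickle is already fast

    # Put the database on whatever drive you use as scratch space.
    conn = sqlite3.connect("scratch.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects (key TEXT PRIMARY KEY, data BLOB)"
    )

    def save(key, obj):
        # Pickle the object and store the bytes as a BLOB.
        blob = sqlite3.Binary(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))
        conn.execute("INSERT OR REPLACE INTO objects VALUES (?, ?)", (key, blob))
        conn.commit()

    def load(key):
        row = conn.execute(
            "SELECT data FROM objects WHERE key = ?", (key,)
        ).fetchone()
        return pickle.loads(bytes(row[0])) if row else None

    save("grid", [[0.0] * 1000 for _ in range(1000)])
    print(len(load("grid")))  # 1000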

Jiaaro
Then I would have to pickle the objects myself, wouldn't I?
cool-RR
I want something that will do it transparently for me. Right now for example, I can run Python on a computer with 100 MB of RAM and load an object of 2 GB into memory, and the majority of it will be in virtual memory, without me knowing about it. That's what I want, except for >4GB. Got it?
cool-RR
Sorry, I think if you want something like that you're going to have to write code to handle it. If I were you, I would write functions for any code that deals with a large file, and have them read the needed data at the beginning and write it back at the end.
Jiaaro
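To illustrate Jiaaro's suggestion, here is a minimal sketch of a wrapper that loads an object from a shelve file at the start of a block of work and writes it back at the end (all names here are hypothetical):

    import shelve
    from contextlib import contextmanager

    @contextmanager
    def on_disk(path, key, factory=dict):
        # Read the needed data at the beginning...
        db = shelve.open(path)
        try:
            obj = db.get(key)
            if obj is None:
                obj = factory()
            yield obj
            # ...and write the (possibly mutated) object back at the end.
            db[key] = obj
        finally:
            db.close()

    with on_disk("scratch_state", "counters") as counters:
        counters["runs"] = counters.get("runs", 0) + 1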