I have a Python module that generates large data files, which I want to cache on disk for future use. The cache is likely to end up at some hundreds of MB for a normal user, but it saves a lot of computation time.

The files aren't distributed with the module, but are generated the first time the code is run with a given set of parameters.

So far I've just been using a single-file module myself and putting the files in a hardcoded path relative to the module (data/). But I now need to distribute this module in a Python package with distutils, and I was wondering if there is a standard way to do that.

I was thinking of something like the compiled cache of scipy.weave, but I'm wondering if there is a more modern, supported way of doing it. On *nix platforms I would expect it to go in ~/.something, but I'm not sure what the Windows equivalent would be. Also, this should be configurable so that users can point it somewhere else if that's more convenient, or share the cache dir between users. How should such a config file work? Where should it go?

Or should I just have it as an install option, either through a config file next to setup.py or set by manually editing setup.py, then hard code the directory in the module before installation?

Any pointers gratefully received...

+3  A: 

You can use the standard library module ConfigParser (renamed configparser in Python 3) to parse an ini file (or .rc file depending on your culture). To find the file, os.path.expanduser is a useful function that does the right thing on all platforms for paths like "~/.mytoolrc". To let the user override the location of things, you can use environment variables via os.environ.
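
As a rough sketch of how those three pieces could fit together (not a definitive implementation): the environment variable name MYTOOL_CACHE_DIR, the default directory ~/.mytool_cache, the [cache] section and its dir option are all hypothetical names chosen for illustration. The lookup order assumed here is environment variable first, then the ini file, then the default.

```python
import os
from configparser import ConfigParser  # the module is named ConfigParser in Python 2

# Hypothetical names, chosen only for illustration.
ENV_VAR = "MYTOOL_CACHE_DIR"
CONFIG_FILE = "~/.mytoolrc"
DEFAULT_CACHE_DIR = "~/.mytool_cache"


def get_cache_dir(environ=None, config_file=CONFIG_FILE):
    """Return the cache directory: environment variable, then ini file, then default."""
    environ = os.environ if environ is None else environ

    # 1. An environment variable, if set and non-empty, overrides everything else.
    if environ.get(ENV_VAR):
        return os.path.expanduser(environ[ENV_VAR])

    # 2. Otherwise look for a "dir" option in a [cache] section of the ini file.
    parser = ConfigParser()
    parser.read(os.path.expanduser(config_file))  # a missing file is silently skipped
    if parser.has_option("cache", "dir"):
        return os.path.expanduser(parser.get("cache", "dir"))

    # 3. Fall back to a directory under the user's home.
    return os.path.expanduser(DEFAULT_CACHE_DIR)
```

With this layout, a ~/.mytoolrc containing a [cache] section with a line such as dir = /data/shared_cache would point every user who has that file at the same directory. The module would still need to create the directory on first use, for example with os.makedirs.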

Ned Batchelder
Thanks - so I guess default to a ~/.mycache or ~/_mycache directory. Look for a ~/.my_module or ~/_my_module config file and if present get the directory from that. Then default should work for most people but it's easy to configure. I would prefer to avoid environment variables since I don't think Windows people usually like having to set stuff like that.
thrope