I am writing a Python package with modules that need to open data files in a ./data/ subdirectory. Right now I have the paths to the files hardcoded into my classes and functions. I would like to write more robust code that can access the subdirectory regardless of where the package is installed on the user's system.

I've tried a variety of methods, but so far I have had no luck. It seems that most of the "current directory" commands return the directory of the system's Python interpreter, not the directory of the module.

This seems like it ought to be a trivial, common problem. Yet I can't seem to figure it out. Part of the problem is that my data files are not .py files, so I can't use import functions and the like.

Any suggestions?

Right now my package directory looks like:

/
    __init__.py
    module1.py
    module2.py
    data/
        data.txt

I am trying to access data.txt from module*.py

Thanks!

+1  A: 

I think I hunted down an answer.

I made a module, data_path.py, which I import into my other modules. It contains:

import os
data_path = os.path.join(os.path.dirname(__file__), 'data')

And then I open all my files with

open(os.path.join(data_path,'filename'), <param>)
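
For illustration, here is a minimal sketch of the same idea written as a reusable function. It assumes the module sits in the package directory next to data/; the names data_dir_for and data_file are hypothetical and not part of the answer above.

```python
import os


def data_dir_for(module_file, subdir='data'):
    """Return the path of `subdir` located next to the given module file."""
    package_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(package_dir, subdir)


def data_file(module_file, name):
    """Return the full path of a file inside the data subdirectory."""
    return os.path.join(data_dir_for(module_file), name)

# Inside data_path.py this would typically be used as:
#     data_path = data_dir_for(__file__)
```

Taking the absolute path first means the result does not depend on the working directory the program was started from.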
Jacob Lyles
+3  A: 

You can use __file__ (underscore-underscore-file-underscore-underscore) to get the path of the current module, and from that the package directory, like this:

import os

# __file__ is the path of this module; its directory is the package directory.
this_dir, this_filename = os.path.split(__file__)
DATA_PATH = os.path.join(this_dir, "data", "data.txt")
with open(DATA_PATH) as f:
    print(f.read())
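
The same lookup can also be written with pathlib on Python 3. This is only a sketch under the layout from the question; the function name read_package_data is made up for the example.

```python
from pathlib import Path


def read_package_data(module_file, *parts):
    """Read a text file located relative to the directory of module_file."""
    base = Path(module_file).resolve().parent
    return base.joinpath(*parts).read_text()

# Inside module1.py this would be called as:
#     text = read_package_data(__file__, "data", "data.txt")
```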
RichieHindle
A: 

This solution fails if your tree looks like this (which seems to me the most natural layout):

/
    package1/
        __init__.py
        i_need_data.py
    data/
        data.txt

The problem is how to get from the module package1.i_need_data to the file data.txt. Two solutions come to mind, neither of which is elegant or robust against changes to the source layout:

  1. Hardcode $this_dir/../data. This is dirty: the path is encoded in two places, in the directory structure (import statements) and in this path.
  2. Make the data dir a package (place __init__.py there) and refer to it with __file__. This also seems dirty, as data is definitely not a package in the Pythonic sense.
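
For what it's worth, the first option can at least be confined to one place. Below is a rough sketch assuming the tree shown above, with data/ as a sibling of package1/; sibling_data_dir is a hypothetical helper name. The relative location is still hardcoded, just only once.

```python
import os


def sibling_data_dir(module_file, dirname="data"):
    """Return <parent of the package dir>/<dirname> for the given module file."""
    package_dir = os.path.dirname(os.path.abspath(module_file))
    # Step up one level from the package directory, then down into dirname.
    return os.path.normpath(os.path.join(package_dir, os.pardir, dirname))

# Inside package1/i_need_data.py:
#     DATA_DIR = sibling_data_dir(__file__)
#     data_txt = os.path.join(DATA_DIR, "data.txt")
```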

Any ideas?

eliasz