Hi, I am a newbie here! I have a question about sharing a resource that holds a file handle between processes. Here is my test code:


from multiprocessing import Process,Lock,freeze_support,Queue
import tempfile
#from cStringIO import StringIO



class File():

    def __init__(self):
        self.temp = tempfile.TemporaryFile()
        #print self.temp

    def read(self):

        print "reading!!!"
        s = "huanghao is a good boy !!"
        print >> self.temp,s
        self.temp.seek(0,0)

        f_content = self.temp.read()
        print f_content



class MyProcess(Process):

    def __init__(self,queue,*args,**kwargs):
        Process.__init__(self,*args,**kwargs)
        self.queue = queue

    def run(self):
        print "ready to get the file object"
        self.queue.get().read()
        print "file object got"
        file.read()


if __name__ == "__main__":
    freeze_support()
    queue = Queue()
    file = File()

    queue.put(file)
    print "file just put"

    p = MyProcess(queue)
    p.start()

Then I got a KeyError like the one below:



file just put
ready to get the file object
Process MyProcess-1:
Traceback (most recent call last):
  File "D:\Python26\lib\multiprocessing\process.py", line 231, in _bootstrap
    self.run()
  File "E:\tmp\mpt.py", line 35, in run
    self.queue.get().read()
  File "D:\Python26\lib\multiprocessing\queues.py", line 91, in get
    res = self._recv()
  File "D:\Python26\lib\tempfile.py", line 375, in __getattr__
    file = self.__dict__['file']
KeyError: 'file'

I think that when I put the File() object into the queue, the object gets serialized, and since a file handle cannot be serialized, I get the KeyError.

Does anyone have any idea about that? If I want to share objects that have a file handle attribute, what should I do?

+1  A: 

I have to object (at length, won't just fit in a comment ;-) to @Mark's repeated assertion that file handles just can't be "passed around between running processes" -- this is simply not true in real, modern operating systems, such as, oh, say, Unix (free BSD variants, MacOSX, and Linux, included -- hmmm, I wonder what OS's are left out of this list...?-) -- sendmsg of course can do it (on a "Unix socket", by sending the descriptor as SCM_RIGHTS ancillary data).
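To make the sendmsg point concrete, here is a minimal sketch in Python 3 (the question itself uses Python 2.6, where `socket.sendmsg` does not exist). It assumes a Unix-like system, and the helper names `send_fd`, `recv_fd`, and `demo` are made up for illustration. Both ends of the socket pair live in one process only to keep the example short; normally each end would belong to a different process.

```python
import array
import os
import socket
import tempfile


def send_fd(sock, fd):
    """Send one descriptor as SCM_RIGHTS ancillary data (hypothetical helper)."""
    fds = array.array("i", [fd])
    # At least one byte of ordinary payload must accompany the ancillary data.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one descriptor sent by send_fd (hypothetical helper)."""
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    raise RuntimeError("no descriptor received")


def demo():
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with sender, receiver, tempfile.TemporaryFile() as tmp:
        tmp.write(b"huanghao is a good boy !!\n")
        tmp.flush()
        send_fd(sender, tmp.fileno())
        new_fd = recv_fd(receiver)
        # The kernel hands out a new descriptor number that refers to the
        # same open file, so the file offset is shared with the original.
        os.lseek(new_fd, 0, os.SEEK_SET)
        with os.fdopen(new_fd, "rb") as received:
            return tmp.fileno(), new_fd, received.read()
```

Calling `demo()` returns the original descriptor number, the received one, and the bytes read through the received descriptor. The two numbers differ even though both refer to the same open file.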

Now the poor, valuable multiprocessing is fully right to not exploit this feature (even assuming there might be black magic to implement it on Windows too) -- most developers would no doubt misuse it anyway (having multiple processes access the same open file concurrently and running into race conditions). The only proper way to use it is for a process which has exclusive rights to open certain files to pass the opened file handles to another process which runs with reduced privileges -- and then never use that handle itself again. No way to enforce that in the multiprocessing module, anyway.

Back to @Andy's original question, unless he's going to work on Linux only (AND with local processes only, too) and willing to play dirty tricks with the /proc filesystem, he's going to have to define his application-level needs more sharply and serialize file objects accordingly. Most files have a path (or can be made to have one: path-less files are pretty rare, actually non-existent on Windows I believe) and thus can be serialized via it -- many others are small enough to serialize by sending their content over -- etc, etc.
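As a rough sketch of the "serialize via its path" option, the class below pickles only the path and mode and reopens the file on the receiving side. It is written for Python 3, and `PathBackedFile` and `demo` are hypothetical names, not part of multiprocessing or tempfile. The pickle round trip stands in for what happens when an object goes through a `multiprocessing.Queue`.

```python
import os
import pickle
import tempfile


class PathBackedFile(object):
    """Hypothetical wrapper that pickles as a path, not as an open handle."""

    def __init__(self, path, mode="r+"):
        self.path = path
        self.mode = mode
        self._handle = open(path, mode)

    def __getstate__(self):
        # Send only what another process needs in order to reopen the file.
        return {"path": self.path, "mode": self.mode}

    def __setstate__(self, state):
        self.path = state["path"]
        self.mode = state["mode"]
        # The receiving side opens its own, independent handle.
        self._handle = open(self.path, self.mode)

    def read(self):
        self._handle.seek(0)
        return self._handle.read()

    def close(self):
        self._handle.close()


def demo():
    # mkstemp gives a file that has a real path, unlike TemporaryFile on Unix.
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.write("huanghao is a good boy !!\n")
    original = PathBackedFile(path)
    try:
        # Same serialization step a multiprocessing.Queue would perform.
        clone = pickle.loads(pickle.dumps(original))
        try:
            return clone.read()
        finally:
            clone.close()
    finally:
        original.close()
        os.remove(path)
```

The two objects hold separate handles with separate offsets, so this shares the file's content, not the open handle. If both processes write to the file, coordinating those writes is still up to the application.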

Alex Martelli
I think what I meant to say (making this at least the second correction) was file *descriptor* -- where you typically get a number >= 3 (due to 0,1,2 being reserved for std{in,out,err}). So if you open a file and its descriptor is 3, passing 3 to another process is meaningless. Did I finally hit it there?
Mark Rushakoff
Thanks a lot, Alex! So, as you said, it is very hard to pass file handles between processes on Windows. If I want to pass files, I should pass the file path or the content of the file, not file handles.
Andy
@Mark, not really, it's indeed the `3` you need to pass... just on an AF_UNIX socket and as SCM_RIGHTS ancillary data (the kernel will do the rest of the needed magic: the number that arrives may likely be != 3 but it will be a descriptor to the same open file). Solaris has a cleaner way, if I recall correctly, and actually several syscalls to deal properly with the issue (but it's been too long since I actually worked on Solaris, sigh, I don't sharply recall).
Alex Martelli
@Andy, yep, you're summarizing the gist of it perfectly -- file paths are easy, file contents are easy, handles to open files are just too hard (and I'm too rusty as a Win32 guru to even TRY to "call those spirits from the vasty depths";-).
Alex Martelli