views:

452

answers:

7

I have threaded code where each thread needs to write to the same file. To prevent concurrency issues, I am using a Lock object.

My question is whether I am using the Lock correctly. If I set the lock from within each thread, is that lock global or only specific to that specific thread?

Basically, should I create a Lock first and pass its reference to each thread, or is it ok to set it from within the thread like I do here:

import time
from threading import Thread, Lock

def main():
    for i in range(20):
        agent = Agent(i)
        agent.start()

class Agent(Thread):
    def __init__(self, thread_num):
        Thread.__init__(self)
        self.thread_num = thread_num

    def run(self):
        while True:
            print 'hello from thread %s' % self.thread_num
            self.write_result()   

    def write_result(self):
        lock = Lock()
        lock.acquire()
        try:
            f = open('foo.txt', 'a')
            f.write('hello from thread %s\n' % self.thread_num)
            f.flush()
            f.close()
        finally:
            lock.release()

if __name__ == '__main__':
    main()
+2  A: 

Create the lock outside the method.

class Agent(Thread):
    mylock = Lock()
    def write_result(self):
        self.mylock.acquire()
        try:
            ...
        finally:
            self.mylock.release()

or if using python >= 2.5:

class Agent(Thread):
    mylock = Lock()
    def write_result(self):
        with self.mylock:
            ...

To use that with python 2.5 you must import the statement from the future:

from __future__ import with_statement
nosklo
yes, you moved it outside the method, but you are still creating it in the thread itself. Isn't that an issue?
Corey Goldberg
@cgoldberd: it is being created as a *class* attribute, that means a single one will be created for all threads. That is a better place to keep it because everything stays on the thread class.
nosklo
gotcha. I like that approach
Corey Goldberg
+1  A: 

The lock() method returns a lock object for every call. So every thread ( actually every call to write_result ) will have a different lock object. And there will be no locking.

Igal Serban
+1  A: 

The lock that's used needs to be common to all threads, or at least ensure that two locks can't lock the same resource at the same time.

Matthew Brubaker
A: 

I'm pretty sure that the lock needs to be the same object for each thread. Try this:

import time
from threading import Thread, Lock

def main():
    lock = Lock()
    for i in range(20):
        agent = Agent(i, lock)
        agent.start()

class Agent(Thread, Lock):
    def __init__(self, thread_num, lock):
        Thread.__init__(self)
        self.thread_num = thread_num
        self.lock = lock

    def run(self):
        while True:
            print 'hello from thread %s' % self.thread_num
            self.write_result()   

    def write_result(self):
        self.lock.acquire()
        try:
            f = open('foo.txt', 'a')
            f.write('hello from thread %s\n' % self.thread_num)
            f.flush()
            f.close()
        finally:
            lock.release()

if __name__ == '__main__':
    main()
Joseph Bui
-1: that won't work.
nosklo
nosklo. why won't that work? The lock is created first and then passed into each thread.
Corey Goldberg
Sorry, I forgot to actually pass the lock object to the Agent constructor. Corrected.
Joseph Bui
now that you fixed it removing my vote
nosklo
thanks.. this is was the other approach I was thinking
Corey Goldberg
+1  A: 

The lock instance should be associated with the file instance.

In other words, you should create both the lock and file at the same time and pass both to each thread.

MSN

MSN
+5  A: 

For your use case one approach could be to write a file subclass that locks:

class LockedWrite(file):
    """ Wrapper class to a file object that locks writes """
    def __init__(self, *args, **kwds):
        super(LockedWrite, self).__init__(*args, **kwds)
        self._lock = Lock()

    def write(self, *args, **kwds):
        self._lock.acquire()
        try:
            super(LockedWrite, self).write(*args, **kwds)
        finally:
            self._lock.release()

To use in your code just replace following functions:

def main():
    f = LockedWrite('foo.txt', 'a')

    for i in range(20):
        agent = Agent(i, f)
        agent.start()

class Agent(Thread):
    def __init__(self, thread_num, fileobj):
        Thread.__init__(self)
        self.thread_num = thread_num
        self._file = fileobj    

#   ...

    def write_result(self):
        self._file.write('hello from thread %s\n' % self.thread_num)

This approach puts file locking in the file itself which seems cleaner IMHO

nosklo
Except the pointless "filelock = Lock()" part, I like this solution better than mine. It would be even better if it somehow intercepted other attempts to open foo.txt for writing/appending and returned to original LockedWrite object.
Joseph Bui
@Joseph: removed the pointless part.
nosklo
You got my vote.
Joseph Bui
+1  A: 

You can simplify things a bit (at the cost of slightly more overhead) by designating a single thread (probably created exclusively for this purpose) as the sole thread that writes to the file, and have all other threads delegate to the file-writer by placing the string that they want to add to the file into a queue.Queue object.

Queues have all of the locking built-in, so any thread can safely call Queue.put() at any time. The file-writer would be the only thread calling Queue.get(), and can presumably spend much of its time blocking on that call (with a reasonable timeout to allow the thread to cleanly respond to a shutdown request). All of the synchronization issues will be handled by the Queue, and you'll be spared having to worry about whether you've forgotten some lock acquire/release somewhere... :)

Jeff Shannon
I normally do this stuff with Queues and a single writer also. I just needed some clarification on locks.
Corey Goldberg