Let's assume I'm stuck using Python 2.6, and can't upgrade (even if that would help). I've written a program that uses the Queue class. My producer is a simple directory listing. My consumer threads pull a file from the queue, and do stuff with it. If the file has already been processed, I skip it. The processed list is generated before all of the threads are started, so it isn't empty.
Here's some pseudo-code.
import Queue, sys, threading
processed = []
def consumer():
while True:
file = dirlist.get(block=True)
if file in processed:
print "Ignoring %s" % file
else:
# do stuff here
dirlist.task_done()
dirlist = Queue.Queue()
for f in os.listdir("/some/dir"):
dirlist.put(f)
max_threads = 8
for i in range(max_threads):
thr = Thread(target=consumer)
thr.start()
dirlist.join()
The strange behavior I'm getting is that if a thread encounters a file that's already been processed, the thread stalls out and waits until the entire program ends. I've done a little bit of testing, and the first 7 threads (assuming 8 is the max) stop, while the 8th thread keeps processing, one file at a time. But, by doing that, I'm losing the entire reason for threading the application.
Am I doing something wrong, or is this the expected behavior of the Queue/threading classes in Python 2.6?