I have a thread that appends rows to self.output and a loop that runs until self.done is True (or the max execution time is reached).

Is there a more efficient way to do this than a while loop that constantly checks whether it's done? The while loop causes the CPU to spike to 100% while it's running.

time.clock()
while True:

    if len(self.output):
        yield self.output.pop(0)

    elif self.done or 15 < time.clock():
        if 15 < time.clock():
            yield "Maximum Execution Time Exceeded %s seconds" % time.clock()
        break
A: 

Use time.sleep(seconds) to pause briefly after each iteration of the while loop and relinquish the CPU. Choose the sleep duration based on how quickly you need to notice that the job has completed.

Example:

time.clock()
while True:

    if len(self.output):
        yield self.output.pop(0)

    elif self.done or 15 < time.clock():
        if 15 < time.clock():
            yield "Maximum Execution Time Exceeded %s seconds" % time.clock()
        break  # leave the loop both when done and when the time limit is hit

    time.sleep(0.01) # sleep for 10 milliseconds
marcog
Sleep usually results in bad performance. You should consider synchronization before using sleeps.
Francis
A: 

You have to use a synchronization primitive here. Look here: http://docs.python.org/library/threading.html.

Event objects seem very simple and should solve your problem. You can also use a condition object or a semaphore.

I don't post an example because I've never used Event objects, and the alternatives are probably less simple.


Edit: I'm not really sure I understood your problem. If a thread can wait until some condition is satisfied, use synchronization. Otherwise the sleep() solution that someone posted will avoid taking too much CPU time.
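
For readers who want to see the Event approach, here is a minimal sketch. It is not the asker's code: the `worker` and `consume` functions, the `lock`, and the two events `ready` and `done` are hypothetical names chosen for illustration. The consumer blocks in `ready.wait()` instead of spinning, and the remaining time is passed as the wait timeout so the overall limit is still honoured.

```python
import threading
import time

def worker(output, lock, ready, done):
    """Hypothetical producer: appends three rows, then signals completion."""
    for i in range(3):
        time.sleep(0.01)              # stand-in for real work
        with lock:
            output.append("row %d" % i)
        ready.set()                   # wake the consumer: new data is available
    done.set()
    ready.set()                       # wake the consumer one last time

def consume(output, lock, ready, done, max_seconds=15):
    """Yield rows as they arrive; stop when done or when time runs out."""
    deadline = time.time() + max_seconds
    while True:
        remaining = deadline - time.time()
        # wait() blocks without using CPU and returns False on timeout.
        if remaining <= 0 or not ready.wait(remaining):
            yield "Maximum Execution Time Exceeded %s seconds" % max_seconds
            return
        ready.clear()
        # Read the flag before draining: every row appended before
        # done.set() is then guaranteed to be picked up below.
        finished = done.is_set()
        with lock:
            rows = output[:]
            del output[:]
        for row in rows:
            yield row
        if finished:
            return
```

Start `worker` in a `threading.Thread` and iterate over `consume(...)` with the same list, lock, and events. Clearing `ready` before draining the list is what keeps a wake-up from being lost.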

Bastien Léonard
A: 

Use the mutex module or an event/semaphore.

Francis
+1  A: 

Use a semaphore; have the working thread release it when it's finished, and block your appending thread until the worker is finished with the semaphore.

i.e. in the worker, do something like self.done = threading.Semaphore(0) at the beginning of work (the initial value must be 0, since the default of 1 would let the first acquire() succeed immediately), and self.done.release() when finished. In the code you noted above, instead of the busy loop, simply do self.done.acquire(); when the worker thread is finished, control will return.

Edit: I'm afraid I don't address your needed timeout value, though; this issue describes the need for a semaphore timeout in the standard library.
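
In Python 3.2 and later, `Semaphore.acquire()` accepts a `timeout` argument, which covers the timeout case. The following is a sketch under that assumption; the `Worker` class and `wait_for` helper are hypothetical names, not part of the original code.

```python
import threading
import time

class Worker(object):
    """Hypothetical worker that signals completion through a semaphore."""

    def __init__(self, work_seconds):
        self.work_seconds = work_seconds
        self.output = []
        # Start at 0 so acquire() blocks until release() is called.
        self.done = threading.Semaphore(0)

    def run(self):
        time.sleep(self.work_seconds)   # stand-in for real work
        self.output.append("row")
        self.done.release()             # signal: finished

def wait_for(worker, max_seconds):
    """Return True if the worker finished within max_seconds, else False."""
    thread = threading.Thread(target=worker.run)
    thread.daemon = True                # do not keep the program alive on timeout
    thread.start()
    # Python 3.2+: blocks without spinning and returns False on timeout.
    return worker.done.acquire(timeout=max_seconds)
```

This only reports when the worker is finished; it does not stream rows as they are produced.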

esm
+7  A: 

Are your threads appending to self.output here, with your main task consuming them? If so, this is a tailor-made job for Queue.Queue. Your code should become something like:

import Queue

# Initialise the queue as:
self.queue = Queue.Queue()
Finished = object()   # Unique marker the producer will put in the queue when finished

# Consumer:
try:
    while True:
        # Block for up to 15 seconds waiting for the next item
        next_item = self.queue.get(timeout=15)
        if next_item is Finished:
            break
        yield next_item

except Queue.Empty:
    print "Timeout exceeded"

Your producer threads add items to the queue with self.queue.put(item).

[Edit] The original code has a race issue when checking self.done (for example multiple items may be appended to the queue before the flag is set, causing the code to bail out at the first one). Updated with a suggestion from ΤΖΩΤΖΙΟΥ - the producer thread should instead append a special token (Finished) to the queue to indicate it is complete.

Note: If you have multiple producer threads, you'll need a more general approach to detecting when they're all finished. You could accomplish this with the same strategy: each thread puts a Finished marker on the queue, and the consumer terminates when it has seen num_threads markers.
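
The multi-producer variant can be sketched as follows. This is an illustration written for Python 3, where the module is named `queue` rather than `Queue`; the `producer` and `consume` functions and their parameters are hypothetical. Note that the timeout passed to `get()` applies to each item separately, not to the run as a whole.

```python
import queue          # named Queue in Python 2
import threading

FINISHED = object()   # unique marker each producer puts on the queue when done

def producer(q, name, count):
    """Hypothetical producer: puts `count` items, then its FINISHED marker."""
    for i in range(count):
        q.put("%s-%d" % (name, i))
    q.put(FINISHED)

def consume(q, num_threads, timeout=15):
    """Yield items until every producer has sent its marker.

    queue.Empty propagates to the caller if no item arrives within
    `timeout` seconds.
    """
    finished = 0
    while finished < num_threads:
        item = q.get(timeout=timeout)   # blocks without busy-waiting
        if item is FINISHED:
            finished += 1
        else:
            yield item
```

Because each producer puts its marker after its last item, the consumer cannot stop while items from a still-running producer are pending.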

Brian
OoOooo, now we're talking. :D
Ian
Is there a way to tell a thread blocking on a Queue.get() without a timeout that the producer is done putting anything in the queue so it could exit cleanly?
Sii
@Sii: You could mark the thread daemonic when you create it. This means the thread will exit when your program exits.
John Fouhy
The producer should Queue.put a special marker that it's done. Either do a `done_marker= object()` and use that, or you could use the Ellipsis object (otherwise useless, typically).
ΤΖΩΤΖΙΟΥ
I should note that your example has a flaw: `if self.done: break` should be changed to `if self.done and self.queue.empty(): break`, otherwise the last items in the queue won't necessarily be included. It took me a bit of toying with it to figure that out.
Ian
You're right - the Done marker is probably the best way to go to avoid race issues. I'll update the code.
Brian