views:

256

answers:

3

I want to tell my Python threads to yield, so that they avoid hogging the CPU unnecessarily. In Java, you can do that with the Thread.yield() function. I don't think there is anything similar in Python, so I have been using time.sleep(t) with t = 0.00001. With t = 0 there seems to be no effect.

I think that maybe there is something I am not understanding correctly about Python's threading model, and hence the reason for the missing thread.yield(). Can someone clarify this to me? Thanks!

PS: This is what the documentation for Java's Thread.yield() says:

Causes the currently executing thread object to temporarily pause and allow other threads to execute.

+2  A: 

Duplicate of: How does a threading.Thread yield the rest of its quantum in Python?

time.sleep(0)
Qberticus
My problem is that `time.sleep(0)` does not seem to have any effect. Thanks for the reference.
Carlos Rocha
hrms, odd, if it doesn't work something else weird must be happening. :)
Qberticus
+1  A: 

The interpreter will switch from one thread to another periodically anyway without your intervention - you don't need to tell the system not to 'hog' a thread.
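As a rough illustration of that periodic switching: in CPython 3.2 and later (an assumption about your interpreter, since older versions used a different mechanism), the interval after which the running thread is asked to give up the interpreter can be read from `sys`. This is only a sketch for inspecting the setting, not a recommendation to tune it.

```python
import sys

# CPython 3.2+ only: the time, in seconds, after which the interpreter
# asks the currently running thread to let another thread run.
interval = sys.getswitchinterval()
print(f"thread switch interval: {interval} s")
```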

However, under normal circumstances only one Python thread executes at any one time, because of the Global Interpreter Lock (GIL). (The exceptions tend to be threads that are blocked waiting on input from external devices such as the hard disk or the network.) This means you probably aren't getting as much benefit from threads in Python as you would in Java or many other languages. Getting around this problem is not necessarily trivial, although moving to multiprocessing instead of multithreading is one good approach, if possible.

However, what you want to do seems flawed in some sense - if you have 2 threads and they both have work to do, you shouldn't have to write application-side code to switch between them. That's the job of the operating system or the virtual machine. If you find yourself telling one thread to 'do less' because you want to favour another thread in terms of processor time, then rather than adding arbitrary yielding or sleeping, you should probably be setting thread priorities instead. (Although again, this may not have much meaning in Python given the presence of the Global Interpreter Lock.)

Kylotan
When a thread does not yield, it starts to starve the other threads. Java and C (pthreads) have yielding functions to tell the scheduler that the thread is "free". The answer here: http://stackoverflow.com/questions/787803/how-does-a-threading-thread-yield-the-rest-of-its-quantum-in-python/787810#787810 seems to give some further insight, but it seems that I cannot get around it.
Carlos Rocha
I think you are misunderstanding how threads work on most systems. The system will switch between threads automatically on a very regular basis so all threads get to run. (This is called 'pre-emption' or 'preemptive multitasking'.) You don't need to manually switch between them on Java or Python. Methods such as yield() on such systems are a minor optimisation, not a requirement.
Kylotan
No, no. I know about pre-emption. The `yield` I am referring to (in the case of Java or pthreads) is simply advising the scheduler. It is a way to suggest a pre-emption point. As you say it is an optimization. I need to go and read more about the GIL. Thanks.
Carlos Rocha
If your thread is wasting time doing something you don't need to do all that often, it would be better to find some way to have the system wake up your thread when there is work to do, instead of adding tiny sleeps or yields which are a hack at best. Python's threading library provides condition variables, for example.
Kylotan
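For anyone wondering what the condition-variable approach might look like, here is a minimal sketch using `threading.Condition`. The names `items`, `results`, `producer` and `consumer` are made up for illustration. The consumer blocks inside `wait()` until the producer notifies it, so no tiny sleeps are needed.

```python
import threading
from collections import deque

items = deque()               # shared work queue (hypothetical name)
cond = threading.Condition()  # guards `items` and signals new work
results = []

def consumer(n):
    """Process n items, blocking in wait() whenever the queue is empty."""
    for _ in range(n):
        with cond:
            # wait() releases the lock while blocked and reacquires it
            # on wakeup; the loop re-checks the queue after each wakeup.
            while not items:
                cond.wait()
            item = items.popleft()
        results.append(item * 2)

def producer(n):
    """Add n items, waking the consumer after each one."""
    for i in range(n):
        with cond:
            items.append(i)
            cond.notify()

worker = threading.Thread(target=consumer, args=(5,))
worker.start()
producer(5)
worker.join()
print(results)  # [0, 2, 4, 6, 8]
```

In practice `queue.Queue` wraps this same pattern, so it is usually the simpler choice when the shared state is just a queue of work.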
Thanks @Kylotan, `Condition` variables are what I was looking for. And you are right, in this case tiny sleeps or waits are a hack, hence my original question.
Carlos Rocha
Also, it was very useful to read the implementation of `Queue` and `collections.deque`.
Carlos Rocha
+1  A: 

Yielding might be useful if you're sitting in a tight loop and busy-waiting (e.g. when polling), because each yield hints to the scheduler that you're not actually doing any work. Thread priorities do not help here, as the polling thread doesn't have lower priority; it just has "less work to do" while busy-waiting. Note that ideally you'd prefer not to schedule the polling thread at all until the event of interest actually happens, but if you're resorting to polling then it probably means you cannot do that anyway.

The GIL issue in Python is that a thread sitting in a loop busy-waiting will prevent other threads from doing useful work (until it's preempted). Yielding is therefore a nice optimization in this case.
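A hedged sketch of what such a yielding poll loop could look like. The `ready` flag and both function names are hypothetical. `os.sched_yield()` exists only on some Unix platforms (Python 3.3+), so the code falls back to the tiny sleep from the question elsewhere. How much this actually helps will depend on the platform and interpreter version.

```python
import os
import threading
import time

ready = False  # hypothetical flag that another thread sets

def set_ready_later():
    """Stand-in for whatever event the poller is waiting on."""
    global ready
    time.sleep(0.05)
    ready = True

def poll_until_ready():
    """Busy-wait on `ready`, offering up the CPU on every iteration."""
    spins = 0
    while not ready:
        spins += 1
        if hasattr(os, "sched_yield"):
            os.sched_yield()     # some Unix platforms only, Python 3.3+
        else:
            time.sleep(0.00001)  # tiny sleep, as in the question
    return spins

setter = threading.Thread(target=set_ready_later)
setter.start()
spins = poll_until_ready()
setter.join()
print(f"polled {spins} times before the flag was set")
```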

Giuliano
This doesn't seem like an answer to the original question, just a counter-argument to my answer!
Kylotan
Right, I'm sorry for that! I wanted to post this as a comment but got confused with the knobs. :-/ When I realized what I had done, it was already too late. My apologies.
Giuliano