Sorry if this is a very stupid question. I am trying to use threads in a Python project I am working on, but they don't appear to behave as they are supposed to in my code. All threads seem to run sequentially (i.e. thread 2 starts only after thread 1 ends, instead of both starting at the same time). I wrote a simple script to test this, and it runs its threads sequentially too.

import threading

def something():
    for i in xrange(10):
        print "Hello"

def my_thing():
    for i in xrange(10):
        print "world"

threading.Thread(target=something).start()
threading.Thread(target=my_thing).start()

Here's the output I get from running it:

Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
world
world
world
world
world
world
world
world
world
world

I see the same behavior with a much larger number of loop iterations.

I tried searching the web and older SO answers, but I couldn't find anything that helped. Can someone please point out what is wrong with this code?

+9  A: 

By the time the second thread has started, the first thread has already finished looping and printing.

On my machine the output looks like this. You can see the second thread start after the first has emitted a few hellos.

Hello
Hello
Hello
Hello
Hello
Helloworld

Helloworld

Helloworld

Helloworld

Helloworld

world
world
world
world
world

By the way, your example is not a very meaningful test. In CPython the main reason to use threads is I/O, and I/O is slow. If you add some sleeping to simulate I/O, it should work as expected:

import threading
from time import sleep

def something():
    for i in xrange(10):
        sleep(0.01)
        print "Hello"

def my_thing():
    for i in xrange(10):
        sleep(0.01)
        print "world"

threading.Thread(target=something).start()
threading.Thread(target=my_thing).start()

A wild mix appears:

worldHello

Helloworld

Helloworld

worldHello

Helloworld

Helloworld

worldHello

Helloworld

worldHello

Helloworld
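
A rough way to check that the sleeping threads really overlap, rather than just trading places, is to time them. This is a minimal sketch in Python 3 syntax (the snippets above are Python 2); the names `worker`, `DELAY` and `STEPS` are made up for the illustration, and the exact timings will vary from machine to machine.

```python
import threading
import time

DELAY = 0.05   # seconds per simulated I/O wait (arbitrary value)
STEPS = 4      # number of waits per thread (arbitrary value)

def worker(name, finished):
    for _ in range(STEPS):
        time.sleep(DELAY)    # sleeping releases the GIL, much like blocking I/O
    finished.append(name)    # list.append is safe across threads in CPython

finished = []
threads = [threading.Thread(target=worker, args=(name, finished))
           for name in ("Hello", "world")]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()                 # wait for both threads before reading the clock
elapsed = time.perf_counter() - start

print("one thread alone needs about %.2fs" % (DELAY * STEPS))
print("both threads together took %.2fs" % elapsed)
```

If the waits overlap, the total should be close to the time for one thread, not double it.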
THC4k
I don't get output like that even with much larger/smaller number of iterations of the for loops. On my computer it is always sequential. I think this is OS/processor dependent, as abyx suggested.
MAK
As I said in my question, this is just an example for my problem, not the code I'm working with (which is much larger). In my actual code, one of the threads runs a loop listening for dbus signals.
MAK
+3  A: 

This really depends on your operating system's scheduler and your processor.
Other than that, CPython's threads are known to be imperfect because of the GIL (global interpreter lock), which, in short, means that much of the time threads do run sequentially, or something close to it.

abyx
You probably mean that CPython threads suffer from the GIL… There is no GIL in, say, Jython.
EOL
@EOL - you are correct, I've updated the answer
abyx
+7  A: 

Currently in CPython, the interpreter switches between threads after a specified number of bytecode instructions have executed. The threads don't run at the same time. You will only have threads executing in parallel when one of them is doing I/O or is inside a module that doesn't touch Python objects and can therefore release the GIL (global interpreter lock).

I'm pretty sure you will get the output mixed up if you bump the number of loops to something like 10000. Remember that simply spawning the second thread also takes "a lot" of time.
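
For readers on Python 3: since version 3.2 the bytecode count has been replaced by a time-based switch interval, exposed as `sys.getswitchinterval()` and `sys.setswitchinterval()`. The sketch below is only an illustration of how one might observe hand-overs between two CPU-bound threads. The helper `interleavings` is a made-up name, and the counts it reports depend on the interpreter build, the OS scheduler and the machine.

```python
import sys
import threading

def interleavings(iterations=200000):
    """Run two CPU-bound threads and count how often the output changes hands."""
    order = []

    def spin(tag):
        for _ in range(iterations):
            order.append(tag)   # record which thread was running

    threads = [threading.Thread(target=spin, args=(tag,)) for tag in "AB"]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # A change from "A" to "B" (or back) means the other thread got to run
    return sum(1 for a, b in zip(order, order[1:]) if a != b)

default = sys.getswitchinterval()
print("switch interval: %.4f s" % default)
print("hand-overs at the default interval:", interleavings())

try:
    sys.setswitchinterval(default / 10)   # ask for more frequent switches
    print("hand-overs at a shorter interval:", interleavings())
finally:
    sys.setswitchinterval(default)        # always restore the original setting
```

A shorter interval usually produces more hand-overs, but it does not make the threads run in parallel. They still take turns holding the GIL.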

viraptor
Same behavior with 10000 iterations
MAK
On the actual project that I'm working on, one of the threads is an infinite loop that listens for messages and calls a callback function as they arrive. It just blocks all the other threads. Unfortunately, the actual loop code cannot be modified (I just call the run() method of a class within the thread).
MAK
When I run the script like this: `./pythr.py | uniq -c` I get: 8969 Hello | 1 Hello world | 6626 world | 1 | 3373 world | 1030 Hello. So it does change the control - just not that often...
viraptor
Another way to fix this is to use the `multiprocessing` module instead of `threading`, so that your code actually runs in parallel.
viraptor
I got some parallel execution on using 100,000 iterations.
MAK
Thanks. Multiprocessing solved the problem in my project code.
MAK
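
For reference, a minimal sketch of the `multiprocessing` approach mentioned in the comments above, in Python 3 syntax. The function `count` and the labels are placeholders invented for this example, not the code from the actual project. Each task runs in its own process with its own interpreter and its own GIL.

```python
import multiprocessing

def count(label, n):
    """CPU-bound placeholder work: count to n and report who did it."""
    total = 0
    for _ in range(n):
        total += 1
    return label, total

if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this file
    with multiprocessing.Pool(processes=2) as pool:
        # starmap unpacks each tuple into count(label, n)
        results = pool.starmap(count, [("Hello", 100000), ("world", 100000)])
    for label, total in results:
        print(label, total)
```

The trade-off is that processes do not share memory the way threads do, so arguments and results have to be picklable and are copied between processes.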
+3  A: 

The behaviour may also change depending on whether the system has a single processor or multiple processors, as explained in this talk by David Beazley.

As viraptor says, the first thread will release the GIL after executing sys.getcheckinterval() bytecodes (100 by default). To crudely summarise what David Beazley says: on a single-processor system the second thread will then have a chance to take over. On a multi-core system, however, the second thread may be running on a different core, and the first thread will try to reacquire the lock and will probably succeed, since the OS will not have had time to switch processors. This means that on a multi-core system with a CPU-bound thread, the other threads may never get a look in.

The way round this is to add a sleep statement to both of the loops so that they are no longer CPU bound.

Dave Kirby