I am looking for a way to simplify my threaded code.

There are a lot of places in my code where I do something like:

for arg in array:
    t = Thread(target=myFunction, args=(arg,))
    t.start()

i.e., running the same function in a separate thread for each parameter.

This is of course a simplified version of the real code. Usually the code inside the for loop is ~10-20 lines long and cannot be reduced to a single auxiliary function like myFunction in the example above (had that been the case, I could have just used a thread pool).

Also, this scenario is very, very common in my code, so there are tons of lines which I consider redundant. It would help me a lot if I didn't need to handle all this boilerplate code and could instead write something like:

for arg in array:
    with threaded():
        myFunction(arg)

i.e., threaded() somehow takes every line of code inside its block and runs it in a separate thread.

I know that context managers aren't supposed to be used in such situations, that it's probably a bad idea and will require an ugly hack, but nonetheless - can it be done, and how?

A: 

Would a thread pool help you here? Many implementations for Python exist, for example this one.


P.S. I'm still interested to know what your exact use case is.

Eli Bendersky
I don't see anything in his question indicating a GIL-laden implementation of Python.
Nicholas Knight
A: 

What you want is a kind of "contextual thread pool".

Take a look at the ThreadPool class in this module, which is designed to be used in a manner similar to the one you've given. Usage would look something like this:

with ThreadPool() as pool:
    for arg in array:
        pool.add_thread(target=myFunction, args=[arg])

A failure in any task given to a ThreadPool flags an error and goes through the standard traceback handling.
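For readers without access to the linked module, a similar "pool as a context manager" shape can be sketched with the standard library's concurrent.futures. This is a substitute, not the ThreadPool class mentioned above; my_function and array are hypothetical stand-ins for the names in the question.

```python
from concurrent.futures import ThreadPoolExecutor

def my_function(arg):
    # Hypothetical stand-in for the question's myFunction.
    return arg * 2

array = [1, 2, 3]

# Leaving the with-block waits until every submitted task has finished.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(my_function, arg) for arg in array]

# result() returns the task's value, or re-raises the exception it raised.
results = [f.result() for f in futures]
```

Exceptions raised inside a task are stored on its future and only surface when result() is called, so a failing task does not go unnoticed as long as the results are collected.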

Matt Joiner
Note of course that you can adjust this to your syntactic taste, but I quite like the standard `threading.Thread` interface, so I adapted it.
Matt Joiner
Thanks, but it still requires me to explicitly define `myFunction`, which is exactly the sort of clutter I am trying to avoid... It was my fault; I should have stressed that more in my question. I'll edit it.
noam
A: 

I think you're over-complicating it. This is the "pattern" I use:

# util.py
import threading

def start_thread(func, *args):
    thread = threading.Thread(target=func, args=args)
    thread.daemon = True  # daemon threads don't keep the process alive
    thread.start()
    return thread

# in another module
import util
...
for arg in array:
    util.start_thread(myFunction, arg)

I don't see the big deal about having to create myFunction. You could even define the function inside the function that starts it.

def do_stuff():
    def thread_main(arg):
        print("I'm a new thread with arg=%s" % arg)
    for arg in array:
        util.start_thread(thread_main, arg)

If you're creating a large number of threads, a thread pool definitely makes more sense. You can easily make your own with the Queue and threading modules. Basically create a jobs queue, create N worker threads, give each thread a "pointer" to the queue and have them pull jobs from the queue and process them.
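The queue-plus-workers idea described above can be sketched roughly as follows. This is a minimal illustration, not a hardened pool: run_in_pool and _STOP are hypothetical names, the module is spelled queue in Python 3 (Queue in Python 2), and results are keyed by argument, so the arguments are assumed to be hashable and distinct.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit

def run_in_pool(func, args, num_workers=4):
    """Run func(arg) for every arg on a fixed number of worker threads."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        # Each worker pulls jobs from the shared queue until it sees _STOP.
        while True:
            arg = jobs.get()
            if arg is _STOP:
                return
            value = func(arg)
            with lock:
                results[arg] = value

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for arg in args:
        jobs.put(arg)
    for _ in threads:
        jobs.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Calling run_in_pool(my_function, array) would then replace the explicit loop; a job that raises ends its worker thread while the remaining workers carry on with the rest of the queue.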

FogleBird
A: 

How about this:

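from threading import Thread

for arg in array:
    def _thread():
        # code here
        print(arg)

    # target must be passed by keyword: Thread's first positional
    # parameter is group, not the callable.
    t = Thread(target=_thread)
    t.start()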

Additionally, with decorators, you can sugar it up a little:

def spawn_thread(func):
    # Starts func in a new thread as soon as it is defined.
    t = Thread(target=func)
    t.start()
    return t

for arg in array:
    @spawn_thread
    def _thread():
        # code here
        print(arg)
Lie Ryan
This almost convinced me, but the problem is the scope of your args. If you do `for i in range(10): @spawn_thread def f(): sleep(1); print i` you get `9 9 9 9 9 9 9 9 9 9` as your output.
noam
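The scoping problem raised in this comment comes from the closure looking up the loop variable when the thread runs rather than when it is created. One possible workaround, sketched here under the assumption that passing the value explicitly is acceptable, is a decorator that takes arguments and forwards them to the thread. This spawn_thread variant is hypothetical and differs from the one in the answer above.

```python
import threading

def spawn_thread(*args):
    """Decorator factory: start func(*args) in a new thread right away."""
    def decorator(func):
        t = threading.Thread(target=func, args=args)
        t.start()
        return t  # the decorated name is bound to the Thread object
    return decorator

seen = []
lock = threading.Lock()
threads = []

for i in range(10):
    @spawn_thread(i)  # i is evaluated here, once per iteration
    def worker(value):
        with lock:
            seen.append(value)
    threads.append(worker)

for t in threads:
    t.join()
```

Each thread receives its own value as a parameter, so the result no longer depends on how far the loop has advanced by the time the thread body runs.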