views:

135

answers:

2

I'm trying to find a simple example that clearly shows a single task being divided for multi-threading.

Quite frankly, many of the examples are overly sophisticated, which makes the flow tougher to play with.

Would anyone care to share their breakthrough sample or point to an example?

Also, what are the best docs? Many Google lookups are too specific for me at this stage.

Thanks in advance.

+1  A: 

The threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel:

import threading

class SummingThread(threading.Thread):
    def __init__(self, low, high):
        threading.Thread.__init__(self)
        self.low = low
        self.high = high
        self.total = 0

    def run(self):
        for i in range(self.low, self.high):
            self.total += i


thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()  
# At this point, both threads have completed
result = thread1.total + thread2.total
print result
Michael Aaron Safyan
Useless in CPython: the GIL guarantees you'll work slower this way than with a simpler approach not using subthreads. You need to get I/O waits involved for threads to make ANY sense in CPython.
Alex Martelli
@Alex, I didn't say it was practical, but it does demonstrate how to define and spawn threads, which I think is what the OP wants.
Michael Aaron Safyan
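As the comments note, CPython's GIL keeps the threaded version from actually using multiple cores for this CPU-bound sum. For comparison, here is a hedged sketch (not part of the original answer, and assuming Python 3) of the same subrange-summing idea using processes instead of threads, which does sidestep the GIL:

```python
# Sketch: the same subrange-summing task, but with worker processes,
# so the GIL does not serialize the CPU-bound work (Python 3 assumed).
from multiprocessing import Pool

def sum_range(bounds):
    low, high = bounds
    return sum(range(low, high))

if __name__ == "__main__":
    with Pool(2) as pool:
        partials = pool.map(sum_range, [(0, 500000), (500000, 1000000)])
    print(sum(partials))  # same total as the threaded version computes
```

The structure mirrors the threaded example: split the range, run each subrange in its own worker, then combine the partial results in the parent.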
+5  A: 

Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.

import Queue
import threading
import urllib

def geturl(q, url):
  q.put(urllib.urlopen(url).read())

theurls = '''http://example.com/be
             http://example.de/bi
             http://example.co.uk/bo'''.split()

q = Queue.Queue()

for u in theurls:
  t = threading.Thread(target=geturl, args=(q, u))
  t.daemon = True
  t.start()

s = q.get()
print s

This is a case where threading is used as a simple optimization: each subthread is waiting for a URL to resolve and respond, in order to put its contents on the queue. Each thread is a daemon (it won't keep the process up if the main thread ends -- that's more common than not). The main thread starts all subthreads, does a get on the queue to wait until one of them has done a put, then emits the result and terminates (which takes down any subthreads that might still be running, since they're daemon threads).

Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically threadsafe so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.
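The farm-out-and-collect pattern described above can be sketched with a small worker pool. This is an illustrative example, not code from the answer, and it uses the Python 3 module name `queue`; the "task" (squaring a number) and the worker count are made up for illustration:

```python
# Sketch of the farm-out pattern: a threadsafe Queue of tasks,
# several daemon worker threads, and a second Queue for results,
# so no explicit locks are needed (Python 3 assumed).
import queue
import threading

tasks = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        n = tasks.get()      # blocks until a task is available
        results.put(n * n)   # Queue.put is threadsafe; no lock needed
        tasks.task_done()

for _ in range(4):           # four daemon workers
    threading.Thread(target=worker, daemon=True).start()

for n in range(10):          # farm out ten tasks
    tasks.put(n)

tasks.join()                 # wait until every task is processed
total = sum(results.get() for _ in range(10))
print(total)                 # sum of the squares 0..9
```

The workers block on `tasks.get()` when idle and never touch shared state directly, which is exactly how queues spare you from locks, conditions, and semaphores.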

Alex Martelli