views:

135

answers:

2

I'm trying to find a simple example that clearly shows a single task being divided for multi-threading.

Quite frankly, many of the examples are overly sophisticated, which makes the flow tougher to play with.

Would anyone care to share their breakthrough sample or point to an example?

Also, what are the best docs? Many Google lookups are too specific for me at this stage.

Thanks in advance.

+1  A: 

The threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel:

import threading

class SummingThread(threading.Thread):
    def __init__(self, low, high):
        threading.Thread.__init__(self)
        self.low = low
        self.high = high
        self.total = 0

    def run(self):
        for i in range(self.low, self.high):
            self.total += i


thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()  
# At this point, both threads have completed
result = thread1.total + thread2.total
print result
Michael Aaron Safyan
Useless in CPython: the GIL guarantees you'll work slower this way than with a simpler approach not using subthreads. You need to get I/O waits involved for threads to make ANY sense in CPython.
Alex Martelli
@Alex, I didn't say it was practical, but it does demonstrate how to define and spawn threads, which I think is what the OP wants.
Michael Aaron Safyan
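As the comments note, CPython's GIL keeps the threaded version from actually using multiple cores for this CPU-bound sum. For comparison, here is a hedged sketch (not part of the original answer, and assuming Python 3) of the same subrange-summing idea using processes instead of threads, which does sidestep the GIL:

```python
# Sketch: the same subrange-summing task, but with worker processes,
# so the GIL does not serialize the CPU-bound work (Python 3 assumed).
from multiprocessing import Pool

def sum_range(bounds):
    low, high = bounds
    return sum(range(low, high))

if __name__ == "__main__":
    with Pool(2) as pool:
        partials = pool.map(sum_range, [(0, 500000), (500000, 1000000)])
    print(sum(partials))  # same total as the threaded version computes
```

The structure mirrors the threaded example: split the range, run each subrange in its own worker, then combine the partial results in the parent.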
+5  A: 

Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.

import Queue
import threading
import urllib

def geturl(q, url):
  q.put(urllib.urlopen(url).read())

theurls = '''http://example.com/be
             http://example.de/bi
             http://example.co.uk/bo'''.split()

q = Queue.Queue()

for u in theurls:
  t = threading.Thread(target=geturl, args=(q, u))
  t.daemon = True
  t.start()

s = q.get()
print s

This is a case where threading is used as a simple optimization: each subthread is waiting for a URL to resolve and respond, in order to put its contents on the queue. Each thread is a daemon (it won't keep the process up if the main thread ends -- that's more common than not). The main thread starts all subthreads, does a get on the queue to wait until one of them has done a put, then emits the result and terminates (which takes down any subthreads that might still be running, since they're daemon threads).

Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically threadsafe so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.
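The farm-out-and-collect pattern described above can be sketched with a small worker pool. This is an illustrative example, not code from the answer, and it uses the Python 3 module name `queue`; the "task" (squaring a number) and the worker count are made up for illustration:

```python
# Sketch of the farm-out pattern: a threadsafe Queue of tasks,
# several daemon worker threads, and a second Queue for results,
# so no explicit locks are needed (Python 3 assumed).
import queue
import threading

tasks = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        n = tasks.get()      # blocks until a task is available
        results.put(n * n)   # Queue.put is threadsafe; no lock needed
        tasks.task_done()

for _ in range(4):           # four daemon workers
    threading.Thread(target=worker, daemon=True).start()

for n in range(10):          # farm out ten tasks
    tasks.put(n)

tasks.join()                 # wait until every task is processed
total = sum(results.get() for _ in range(10))
print(total)                 # sum of the squares 0..9
```

The workers block on `tasks.get()` when idle and never touch shared state directly, which is exactly how queues spare you from locks, conditions, and semaphores.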

Alex Martelli