Hello SO, this question relates to performance penalties that may or may not arise from having a large number of sleeping Python threads on a web server.

Background: I am implementing an online shop using django/satchmo. A requirement is for delayed payment. The customer can reserve a product and allow a third party to pay for it at a later date (via a random and unique URL).

To handle unreserving an item I am creating a thread which will sleep for the reservation time and then remove the reservation/mark the product as sold when it awakes. It looks like this:

from threading import Timer

# Reserves a product when it is placed in the cart
def reserve_cart_product(product):
    log.debug("Reserving %s", product.name)
    product.active = False
    product.featured = False
    product.save()
    # After CART_RESERVE_TIME seconds, check_reservation(product) runs in its own thread
    Timer(CART_RESERVE_TIME, check_reservation, (product,)).start()

I am using the same technique when culling the unique URLs after they have expired, only the Timer sleeps for much longer (typically 5 days).

So, my question to you SO is as follows:

Is having a large number of sleeping threads going to seriously affect performance? Are there better techniques for scheduling a one-off event sometime in the future? I would like to keep this in Python if possible; no calling at or cron via sys.

The site isn't exactly high traffic; a (generous) upper limit on products ordered per week would be around 100. Combined with cart reservation, this could mean there are 100+ sleeping threads at any one time. Will I regret scheduling tasks in this manner?

Thanks

+5  A: 

I see no reason why this shouldn't work. The underlying code for Timer (in threading.py) simply uses time.sleep. Once it has been waiting for a while, it basically runs a loop with time.sleep(0.05). This should result in CPU usage of basically 0%, even with hundreds of threads. Here's a simple example, where I noticed 0% CPU usage for the Python process:

import threading

def nothing():
    pass

def testThreads():
    timers = [threading.Timer(10.0, nothing) for _ in range(881)]
    print("Starting threads.")
    # Explicit loops instead of map(), which is lazy on Python 3
    for timer in timers:
        timer.start()
    print("Joining threads.")
    for timer in timers:
        timer.join()
    print("Done.")

if __name__ == "__main__":
    testThreads()

The real issue is that you may not be able to actually start too many threads. On my 64-bit 4GB system, I can only start 881 threads before I get an error. If you really will only have a few hundred, though, I can't imagine it won't work.

Daniel G
+3  A: 

Usually, sleeping threads have no overhead aside from the memory allocated for their stacks and other private data. The scheduling algorithms of modern operating systems have O(1) complexity, so even a running thread introduces no overhead beyond its memory footprint. At the same time, it is hard to imagine an efficient design that requires a lot of threads. The only case I can imagine is communication with many other peers, and in that case asynchronous IO should be used.

David Gruzman
+2  A: 

100 threads is no problem, but as tgray pointed out, what happens if the server goes down? A power cut, planned maintenance, hardware failure, etc.

You need to store the unreservation information in your database somewhere.

Then you can have a cron job periodically trigger an unreservation script for example, and you don't need to have all those threads sitting around.
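
As a rough sketch of what such an unreservation script could look like: the version below uses sqlite3 only to stay self-contained, and the table name `reservation`, its columns, and the function names are all made up for illustration. In a Django/Satchmo shop the expiry time would more likely be a model field and the script a management command.

```python
import sqlite3
import time

def create_schema(conn):
    # Hypothetical table: one row per reserved product, with its expiry time
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reservation ("
        " product_id INTEGER PRIMARY KEY,"
        " expires_at REAL NOT NULL)"
    )

def reserve(conn, product_id, hold_seconds, now=None):
    # Record when the reservation lapses instead of starting a Timer
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO reservation (product_id, expires_at)"
        " VALUES (?, ?)",
        (product_id, now + hold_seconds),
    )
    conn.commit()

def release_expired(conn, now=None):
    # Meant to be run periodically (e.g. from cron); returns the released ids
    now = time.time() if now is None else now
    rows = conn.execute(
        "SELECT product_id FROM reservation"
        " WHERE expires_at <= ? ORDER BY product_id",
        (now,),
    ).fetchall()
    expired = [row[0] for row in rows]
    conn.execute("DELETE FROM reservation WHERE expires_at <= ?", (now,))
    conn.commit()
    return expired
```

Because the expiry time lives in the database, a restart loses nothing: the next run simply picks up whatever is overdue.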

If you really don't want to use cron, just have one worker thread that sleeps for a minute and then checks whether any of the unreservations are due.
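
A minimal sketch of that single worker thread, assuming you already have some callable that releases whatever is due (called `release_due` here; the name and `start_reaper` are hypothetical):

```python
import logging
import threading

log = logging.getLogger(__name__)

def start_reaper(release_due, interval=60.0):
    """Start one background thread that calls release_due() every `interval` seconds."""
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout and True once stop is set,
        # so this sleeps for `interval` between checks and exits promptly on stop.set()
        while not stop.wait(interval):
            try:
                release_due()
            except Exception:
                # Log and carry on so one failed check does not kill the worker
                log.exception("unreservation check failed")

    thread = threading.Thread(target=loop, name="reservation-reaper")
    thread.daemon = True  # do not keep the process alive on shutdown
    thread.start()
    return stop, thread
```

Call `stop.set()` to shut it down cleanly. Note that if the web server runs several Python processes, each one that calls `start_reaper` gets its own worker, so `release_due` should be safe to run concurrently.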

gnibbler