I'm writing a Python script to read through a list of domains, look up the rating McAfee's SiteAdvisor service gives each one, then output the domain and result to a CSV.

I've based my script off this previous answer. It uses urllib to scrape SiteAdvisor's page for the domain in question (not the best method, I know, but SiteAdvisor provides no alternative). Unfortunately, it fails to produce anything - I consistently get this error:

Traceback (most recent call last):
  File "multi.py", line 55, in <module>
    main()
  File "multi.py", line 44, in main
    resolver_thread.start()
  File "/usr/lib/python2.6/threading.py", line 474, in start
    _start_new_thread(self.__bootstrap, ())
thread.error: can't start new thread

Here is my script:

import threading
import urllib

class Resolver(threading.Thread):
    def __init__(self, address, result_dict):
        threading.Thread.__init__(self)
        self.address = address
        self.result_dict = result_dict

    def run(self):
        try:
            content = urllib.urlopen("http://www.siteadvisor.com/sites/" + self.address).read(12000)
            search1 = content.find("didn't find any significant problems.")
            search2 = content.find('yellow')
            search3 = content.find('web reputation analysis found potential security')
            search4 = content.find("don't have the results yet.")

            if search1 != -1:
                result = "safe"
            elif search2 != -1:
                result = "caution"
            elif search3 != -1:
                result = "warning"
            elif search4 != -1:
                result = "unknown"
            else:
                result = ""

            self.result_dict[self.address] = result

        except:
            pass


def main():
    infile = open("domainslist", "r")
    intext = infile.readlines()
    threads = []
    results = {}
    for address in [address.strip() for address in intext if address.strip()]:
        resolver_thread = Resolver(address, results)
        threads.append(resolver_thread)
        resolver_thread.start()

    for thread in threads:
        thread.join()

    outfile = open('final.csv', 'w')
    outfile.write("\n".join("%s,%s" % (address, ip) for address, ip in results.iteritems()))
    outfile.close()

if __name__ == '__main__':
    main()

Any help would be greatly appreciated.

+1  A: 

It looks like you are trying to start too many threads.

You can check how many items are in the [address.strip() for address in intext if address.strip()] list. I guess this is the problem here: there is a limit on the resources available to the process, which caps how many new threads you can start.

The solution is to split your list into chunks of, say, 20 elements, process each chunk (in 20 threads), wait for those threads to finish their jobs, and then pick up the next chunk. Repeat until every element in your list has been processed.
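For example, here is a minimal sketch of that batching approach, reusing the Resolver class from the question; the chunk size of 20 and this rewritten main() are just illustrative:

def main():
    with open("domainslist", "r") as infile:
        addresses = [line.strip() for line in infile if line.strip()]

    results = {}
    chunk_size = 20
    # Process the domains in batches so at most chunk_size threads exist at once.
    for i in range(0, len(addresses), chunk_size):
        batch = [Resolver(address, results) for address in addresses[i:i + chunk_size]]
        for thread in batch:
            thread.start()
        # Wait for the whole batch to finish before starting the next one.
        for thread in batch:
            thread.join()

    outfile = open('final.csv', 'w')
    outfile.write("\n".join("%s,%s" % (address, rating) for address, rating in results.iteritems()))
    outfile.close()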

You can also use a thread pool for better thread management. (I recently used this implementation.)

Lukasz Dziedzia
Sounds like a good idea. Thanks
Tom
Glad I could help you
Lukasz Dziedzia
+1  A: 

There's probably an upper limit to the number of threads you can create, and you're probably exceeding it.

Suggestion: Create a small, fixed number of Resolvers - fewer than 10 will probably get you 90% of the possible parallelism benefit - and a (threadsafe) Queue from Python's Queue module. Have the main thread dump all the domains into the queue, and have each Resolver take one domain at a time from the queue and work on it.
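A rough sketch of that worker-pool approach, assuming Python 2 (the Queue module) and using a simplified rating check in place of the full set of string searches; worker and NUM_WORKERS are illustrative names, not part of the original script:

import threading
import urllib
import Queue  # named "queue" in Python 3

NUM_WORKERS = 8  # small, fixed number of worker threads

def worker(domain_queue, results):
    # Each worker repeatedly takes one domain from the queue until it is empty.
    while True:
        try:
            address = domain_queue.get_nowait()
        except Queue.Empty:
            return
        content = urllib.urlopen("http://www.siteadvisor.com/sites/" + address).read(12000)
        # Simplified: only one of the string searches from Resolver.run().
        results[address] = "safe" if "didn't find any significant problems." in content else ""

def main():
    domain_queue = Queue.Queue()
    results = {}
    for line in open("domainslist"):
        if line.strip():
            domain_queue.put(line.strip())

    workers = [threading.Thread(target=worker, args=(domain_queue, results))
               for _ in range(NUM_WORKERS)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

Because every domain is put on the queue before the workers start, get_nowait() only raises Queue.Empty once the work is genuinely done, so each worker can simply exit at that point.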

Russell Borogove