It seems I can write a fast crawler in Python in two ways:
1. a thread pool with blocking sockets
2. non-blocking sockets (select, asyncore, etc.)
I think there is no real need for threads here, and that solution #2 is better.
Which is better, and why?
Twisted is usually preferred to asyncore. It is an asynchronous I/O framework that can also work with thread pools.
In Python, you should prefer asynchronous I/O to threads, simply because threads are second-class citizens in its canonical implementation (CPython): the Global Interpreter Lock (GIL) lets only one thread execute Python bytecode at a time.
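To make option #2 concrete, here is a minimal sketch of one thread servicing several non-blocking sockets with the standard library's `selectors` module (a higher-level wrapper around `select` and friends). It is only an illustration, not how Twisted does it internally: the names `read_all` and `demo` are made up for this example, and `socket.socketpair()` stands in for real connections to web servers so the code runs without a network.

```python
import selectors
import socket


def read_all(socks):
    """Read every socket in `socks` (name -> socket) to EOF from one thread."""
    sel = selectors.DefaultSelector()
    buffers = {}
    for name, sock in socks.items():
        sock.setblocking(False)  # recv() will never block the loop
        buffers[name] = bytearray()
        sel.register(sock, selectors.EVENT_READ, data=name)

    pending = len(socks)
    while pending:
        # Sleep until at least one socket has data (or has been closed).
        for key, _events in sel.select():
            sock = key.fileobj
            try:
                chunk = sock.recv(4096)
            except BlockingIOError:
                continue  # spurious wakeup, try again on the next pass
            if chunk:
                buffers[key.data].extend(chunk)
            else:
                # Empty read means the peer closed the connection.
                sel.unregister(sock)
                sock.close()
                pending -= 1
    sel.close()
    return {name: bytes(buf) for name, buf in buffers.items()}


def demo():
    # Hypothetical stand-in for three HTTP responses: each "server" end
    # writes a small payload and closes, as a server would after replying.
    socks = {}
    for name in ("a", "b", "c"):
        ours, theirs = socket.socketpair()
        theirs.sendall(("page " + name).encode())
        theirs.close()
        socks[name] = ours
    return read_all(socks)


if __name__ == "__main__":
    print(demo())
```

In a real crawler you would also have to handle non-blocking connects, writes, timeouts and HTTP parsing yourself, which is the bookkeeping a framework such as Twisted takes off your hands.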