Was looking to write a little web crawler in python. I was starting to investigate writing it as a multithreaded script, one pool of threads downloading and one pool processing results. Due to the GIL would it actually do simultaneous downloading? How does the GIL affect a web crawler? Would each thread pick some data off the socket, then move on to the next thread, let it pick some data off the socket, etc..?
Basically I'm asking is doing a multi-threaded crawler in python really going to buy me much performance vs single threaded?
thanks!