I recently solved this same problem. One thing to consider is that some people don't take kindly to having their servers bogged down and will block an IP address that does so. The standard courtesy that I've heard is about 3 seconds between page requests, but this is flexible.
If you are downloading from multiple websites, you can group your URLs by domain and create one thread per domain. Then in each thread you can do something like this:
import time

for url in urls:
    timer = time.time()
    # ... get your content ...
    # Perhaps put the content in a queue to be written back to
    # your database if it doesn't allow concurrent writes.

    # Throttle: wait until at least 3 seconds have passed since the request started.
    while time.time() - timer < 3.0:
        time.sleep(0.5)
Sometimes just getting the response will take the full 3 seconds, in which case you don't have to worry about the wait at all.
Granted, this won't help you at all if you're only downloading from one site, but it may keep you from getting blocked.
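For the grouping-by-domain part, here is a minimal sketch of how the pieces might fit together, assuming urllib for fetching and a standard-library Queue as the hand-off to a single writer thread; the fetch_domain and crawl names and the per-request details are just placeholders for illustration:

import threading
import time
from queue import Queue
from urllib.parse import urlparse
from urllib.request import urlopen  # assumption: stdlib fetch; swap in your own HTTP client

def fetch_domain(urls, results):
    # One thread per domain: roughly one request every 3 seconds per domain.
    for url in urls:
        timer = time.time()
        try:
            content = urlopen(url).read()
            results.put((url, content))
        except Exception as exc:
            results.put((url, exc))
        while time.time() - timer < 3.0:
            time.sleep(0.5)

def crawl(all_urls):
    # Group URLs by domain so each domain gets its own throttled thread.
    by_domain = {}
    for url in all_urls:
        by_domain.setdefault(urlparse(url).netloc, []).append(url)

    results = Queue()
    threads = [threading.Thread(target=fetch_domain, args=(urls, results))
               for urls in by_domain.values()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Drain the queue in the main thread, e.g. a single writer for a
    # database that doesn't allow concurrent writes.
    while not results.empty():
        url, content = results.get()
        # ... write to your database here ...

With one thread per domain, each site still sees at most one request every 3 seconds, while the main thread stays the only writer to the database.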
My machine handled about 200 threads before the overhead of managing them slowed the process down; I ended up at something like 40-50 pages per second.