This was the closest question to mine, and it wasn't really answered very well IMO:
http://stackoverflow.com/questions/2022030/web-scraping-etiquette
I'm looking for the answer to #1:
How many requests/second should you be doing to scrape?
Right now I pull from a queue of links. Every site that gets scraped has its own thread and sleeps for 1 second between requests. I ask for gzip compression to save bandwidth.
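To make the setup concrete, here is a minimal Python sketch of it: links are grouped by host, each host gets its own thread, and each thread pauses between consecutive requests. The language and every name in it (`crawl`, `site_worker`, `group_by_host`, `fetch`, `DELAY_SECONDS`) are illustrative assumptions rather than my actual code, and the 1-second delay is just the value I use now, not a recommended rate.

```python
import gzip
import queue
import threading
import time
import urllib.request
from urllib.parse import urlsplit

DELAY_SECONDS = 1.0  # assumption: the 1-second pause described above


def fetch(url):
    """Fetch one URL, asking the server for gzip to save bandwidth."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = resp.read()
        # urllib does not decompress automatically, so do it by hand.
        if resp.headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(body)
        return body


def group_by_host(urls):
    """Split a flat list of links into one FIFO queue per site."""
    queues = {}
    for url in urls:
        host = urlsplit(url).netloc
        queues.setdefault(host, queue.Queue()).put(url)
    return queues


def site_worker(links, fetch_fn, results, lock, delay, sleep):
    """Drain one site's queue, sleeping between consecutive requests."""
    first = True
    while True:
        try:
            url = links.get_nowait()
        except queue.Empty:
            return
        if not first:
            sleep(delay)  # no pause before the very first request
        first = False
        try:
            outcome = fetch_fn(url)
        except OSError as exc:  # urllib's URLError/HTTPError derive from OSError
            outcome = exc
        with lock:
            results.append((url, outcome))


def crawl(urls, fetch_fn=fetch, delay=DELAY_SECONDS, sleep=time.sleep):
    """Run one worker thread per host and collect (url, outcome) pairs."""
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(
            target=site_worker,
            args=(links, fetch_fn, results, lock, delay, sleep),
        )
        for links in group_by_host(urls).values()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With this shape, different sites are fetched in parallel, but any single site sees at most one request in flight and roughly one request per `delay` seconds plus the response time. The `delay` parameter is the number I'm asking about.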
Are there standards for this? Surely all the big search engines follow some set of guidelines here.