First, let me recap what we need to do here. I realize I'm just paraphrasing the original question, but it's important that we get this 100% straight, because there are a lot of great suggestions that get 2 or 3 out of 4 right, but as I will demonstrate, you will need a multifaceted approach to cover all of the requirements.
Requirement 1: Getting rid of the 'bot slamming':
The rapid-fire 'slamming' of your front page is hurting your site's performance and is at the core of the problem. The 'slamming' comes from both single-IP bots and - supposedly - from botnets as well. We want to get rid of both.
Requirement 2: Don't mess with the user experience:
We could fix the bot situation pretty effectively by implementing a nasty verification procedure like phoning a human operator, solving a bunch of CAPTCHAs, or similar, but that would be like forcing every innocent airplane passenger to jump through crazy security hoops just for the slim chance of catching the very stupidest of terrorists. Oh wait - we actually do that. But let's see if we can not do that on woot.com.
Requirement 3: Avoiding the 'arms race':
As you mention, you don't want to get caught up in the spambot arms race. So you can't use simple tweaks like hidden or jumbled form fields, math questions, etc., since they are essentially obscurity measures that can be trivially autodetected and circumvented.
Requirement 4: Thwarting 'alarm' bots:
This may be the most difficult of your requirements. Even if we can make an effective human-verification challenge, bots could still poll your front page and alert the scripter when there is a new offer. We want to make those bots infeasible as well. This is a stronger version of the first requirement, since not only can't the bots issue performance-damaging rapid-fire requests -- they can't even issue enough repeated requests to send an 'alarm' to the scripter in time to win the offer.
Okay, so let's see if we can meet all four requirements. First, as I mentioned, no one measure is going to do the trick. You will have to combine a couple of tricks to achieve it, and you will have to swallow two annoyances:
- A small number of users will be required to jump through hoops
- A small number of users will be unable to get the special offers
I realize these are annoying, but if we can make the 'small' number small enough, I hope you will agree the positives outweigh the negatives.
First measure: User-based throttling:
This one is a no-brainer, and I'm sure you do it already. If a user is logged in, and keeps refreshing 600 times a second (or something), you stop responding and tell him to cool it. In fact, you probably throttle his requests significantly sooner than that, but you get the idea. This way, a logged-in bot will get banned/throttled as soon as it starts polling your site. This is the easy part. The unauthenticated bots are our real problem, so on to them:
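To make the idea concrete, here is a minimal sketch of per-user throttling using a sliding window. Everything in it is my own illustration, not something from your codebase: the `UserThrottle` name, the limit of 5 requests per 10 seconds, and the in-process dictionary are all assumptions.

```python
import time
from collections import defaultdict, deque


class UserThrottle:
    """Sliding-window limit on requests per logged-in user (hypothetical helper)."""

    def __init__(self, max_requests=5, window_seconds=10.0, clock=time.monotonic):
        self.max_requests = max_requests      # assumed limit, tune to taste
        self.window_seconds = window_seconds  # assumed window length
        self.clock = clock                    # injectable so the logic can be tested
        self._hits = defaultdict(deque)       # user_id -> timestamps of recent requests

    def allow(self, user_id):
        """Return True if this request may proceed, False if the user should cool it."""
        now = self.clock()
        hits = self._hits[user_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Rejected requests are deliberately not recorded here, so a user who backs off gets served again once the window has passed. Whether you throttle, delay, or ban outright at that point is a policy choice the sketch leaves open.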
Second measure: Some form of IP throttling, as suggested by nearly everyone:
No matter what, you will have to do some IP-based throttling to thwart the 'bot slamming'. Since it seems important to you to allow unauthenticated (non-logged-in) visitors to get the special offers, you only have IPs to go by initially, and although they're not perfect, they do work against single-IP bots. Botnets are a different beast, but I'll come back to those. For now, we will do some simple throttling to beat rapid-fire single-IP bots.
The performance hit is negligible if you run the IP check before all other processing, use a proxy server for the throttling logic, and keep the per-IP data in a lookup-optimized in-memory store such as memcached.
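As a rough sketch of what the IP check could look like, here is a fixed-window counter per IP. The `IpThrottle` name and the limit of 10 requests per 60 seconds are assumptions of mine, and the plain dictionary only stands in for whatever shared cache you actually use. In a shared cache you would more likely keep one counter per IP with an expiry time instead of clearing everything at once.

```python
import time


class IpThrottle:
    """Fixed-window request counter per IP (hypothetical helper)."""

    def __init__(self, limit=10, window_seconds=60, clock=time.time):
        self.limit = limit                    # assumed: max requests per IP per window
        self.window_seconds = window_seconds  # assumed window length
        self.clock = clock                    # injectable so the logic can be tested
        self._window = None                   # index of the window being counted
        self._counts = {}                     # ip -> count; stand-in for a shared cache

    def allow(self, ip):
        """Count this request and report whether the IP is still under its limit."""
        window = int(self.clock() // self.window_seconds)
        if window != self._window:
            # A new window started: forget the previous window's counts.
            self._window = window
            self._counts.clear()
        self._counts[ip] = self._counts.get(ip, 0) + 1
        return self._counts[ip] <= self.limit
```

A fixed window is the cheapest variant, but it lets an IP squeeze in up to twice the limit across a window boundary. If that matters to you, a sliding window or a token bucket closes the gap at the cost of a little more state per IP.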
Third measure: Cloaking the throttle with cached responses:
With rapid-fire single-IP bots throttled, we still have to address slow single-IP bots, i.e. bots that are specifically tweaked to 'fly under the radar' by spacing their requests just far enough apart to stay below the throttle's limit.
To instantly render slow single-IP bots useless, simply use the strategy suggested by abelenky: serve 10-minute-old cached pages to all IPs that have been spotted in the last 24 hours (or so). That way, every IP gets one 'chance' per day/hour/week (depending on the period you choose), and there will be no visible annoyance to real users who are just hitting 'reload', except that they don't win the offer.
The beauty of this measure is that it also thwarts 'alarm bots', as long as they don't originate from a botnet.
(I know you would probably prefer it if real users were allowed to refresh over and over, but there is no way to tell a refresh-spamming human from a request-spamming bot without a CAPTCHA or similar.)
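Here is a small sketch of the cloaking logic, under one assumption about how to read 'one chance per period': an IP gets the fresh page once, and then only the stale cached copy until the period has elapsed since that fresh page was served. The `StalePageGate` name and the in-process dictionary are made up for illustration.

```python
import time


class StalePageGate:
    """Serve the fresh page once per IP per period, the stale copy otherwise."""

    def __init__(self, period_seconds=24 * 3600, clock=time.time):
        self.period_seconds = period_seconds  # assumed: 24 hours, as suggested above
        self.clock = clock                    # injectable so the logic can be tested
        self._last_fresh = {}                 # ip -> time the fresh page was last served

    def choose_page(self, ip, fresh_page, stale_page):
        """Return fresh_page if this IP has a 'chance' left, else stale_page."""
        now = self.clock()
        last = self._last_fresh.get(ip)
        if last is not None and now - last < self.period_seconds:
            # Seen within the period: cloak the throttle with the old page.
            return stale_page
        self._last_fresh[ip] = now
        return fresh_page
```

Because the stale page looks like a normal response, a bot gets no error code to react to. It just keeps seeing the same 10-minute-old content.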
Fourth measure: reCAPTCHA:
You are right that CAPTCHAs hurt the user experience and should be avoided. However, in *one* situation they can be your best friend: if you've designed a very restrictive system to thwart bots that, because of its restrictiveness, also catches a number of false positives, then a CAPTCHA served as a last resort will allow those real users who get caught to slip by your throttling (thus avoiding annoying DoS situations).
The sweet spot, of course, is when ALL the bots get caught in your net, while extremely few real users get bothered by the CAPTCHA.
If, when serving up the 10-minute-old cached pages, you also offer an alternative, optional, CAPTCHA-verified 'front page refresher', then humans who really want to keep refreshing can still do so without getting the old cached page, at the cost of solving a CAPTCHA for each refresh. That is an annoyance, but an optional one just for the die-hard users, who tend to be more forgiving because they know they're gaming the system to improve their chances, and that improved chances don't come free.
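The decision itself is tiny, so here is a sketch of it. The function name and parameters are hypothetical, and `verify_captcha` is a placeholder for whatever server-side check your CAPTCHA provider offers. I pass it in as a callable so the sketch doesn't pretend to know that API.

```python
def select_front_page(seen_recently, captcha_token, verify_captcha,
                      fresh_page, stale_page):
    """Pick the page to serve; a solved CAPTCHA buys one fresh refresh.

    seen_recently  -- True if this IP already used its 'chance' for the period
    captcha_token  -- the visitor's CAPTCHA answer, or None if none was submitted
    verify_captcha -- callable(token) -> bool, a stand-in for the real check
    """
    if not seen_recently:
        return fresh_page  # first visit in the period: no hoops at all
    if captcha_token is not None and verify_captcha(captcha_token):
        return fresh_page  # die-hard human paid for the refresh with a CAPTCHA
    return stale_page      # everyone else keeps getting the old cached page
```

Note that the CAPTCHA is only ever checked for visitors who have already been seen, so first-time visitors never run into it.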
Fifth measure: Decoy crap:
Christopher Mahan had an idea that I rather liked, but I would put a different spin on it. Every time you are preparing a new offer, prepare two other 'offers' as well, that no human would pick, like a 12mm wingnut for $20. When the offer appears on the front page, put all three 'offers' in the same picture, with numbers corresponding to each offer. When the user/bot actually goes on to order the item, they will have to pick (a radio button) which offer they want, and since most bots would merely be guessing, in two out of three cases, the bots would be buying worthless junk.
Naturally, this doesn't address 'alarm bots', and there is a (slim) chance that someone could build a bot that was able to pick the correct item. However, the risk of accidentally buying junk should make scripters turn away entirely from the fully automated bots.
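A quick sketch of the decoy mechanics, plus a simulation of the 'two out of three' claim for a bot that guesses blindly. All names here are invented for illustration, and the simulation assumes the bot picks uniformly at random among the numbered offers.

```python
import random


def build_offer_slots(real_offer, decoys, rng):
    """Shuffle the real offer in among the decoys and number them from 1."""
    offers = [real_offer] + list(decoys)
    rng.shuffle(offers)
    return {number: offer for number, offer in enumerate(offers, start=1)}


def is_real_choice(slots, chosen_number, real_offer):
    """True if the radio button the buyer picked points at the real offer."""
    return slots.get(chosen_number) == real_offer


def junk_rate_for_guessing_bot(real_offer, decoys, trials, rng):
    """Fraction of purchases in which a blindly guessing bot buys a decoy."""
    junk = 0
    for _ in range(trials):
        slots = build_offer_slots(real_offer, decoys, rng)
        guess = rng.choice(list(slots))  # the bot picks a number at random
        if not is_real_choice(slots, guess, real_offer):
            junk += 1
    return junk / trials
```

With two decoys the rate should hover around 2/3, e.g. `junk_rate_for_guessing_bot("gadget", ["wingnut", "gravel"], 10_000, random.Random(1))`. Reshuffling the numbering for every offer matters: if the real item always sat in slot 1, a bot would learn that after one sale.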
Sixth measure: Botnet Throttling:
[deleted]
Okay... I've now spent most of my evening thinking about this, trying different approaches: global delays, cookie-based tokens, queued serving, 'stranger throttling'... And it just doesn't work. It doesn't. I realized the main reason why you hadn't accepted any answer yet was that no one had proposed a way to thwart a distributed/zombie net/botnet attack, so I really wanted to crack it. I believe I cracked the botnet problem for authentication in a different thread, so I had high hopes for your problem as well. But my approach doesn't translate to this. You only have IPs to go by, and a large enough botnet doesn't reveal itself in any analysis based on IP addresses.
So there you have it: My sixth measure is naught. Nothing. Zip. Unless the botnet is small and/or fast enough to get caught in the usual IP throttle, I don't see any effective measure against botnets that doesn't involve explicit human verification such as CAPTCHAs. I'm sorry, but I think combining the above five measures is your best bet. And you could probably do just fine with abelenky's 10-minute-caching trick alone.