views: 1773, answers: 15

First, a little background: It is no secret that I am implementing an auth+auth system for CodeIgniter, and so far I'm winning (so to speak). But I've run into a pretty non-trivial challenge (one that most auth libraries miss entirely, but I insist on handling it properly): how to deal intelligently with large-scale, distributed, variable-username brute-force attacks.

I know all the usual tricks:

  1. Limiting # of failed attempts per IP/host and denying the offenders access (e.g. Fail2Ban) - which no longer works since botnets have grown smarter (a minimal sketch of this approach follows the list)
  2. Combining the above with a blacklist of known 'bad' IPs/hosts (e.g. DenyHosts) - which relies on botnets falling for #1, which they increasingly don't
  3. IP/host whitelists combined with traditional auth (sadly useless with dynamic IP users and the high churn on most web sites)
  4. Setting a sitewide limit on # of failed attempts within a N minute/hour period, and throttling (suspending) all login attempts after that for a number of minutes/hours (with the problem that DoS attacking you becomes botnet child's play)
  5. Mandatory digital signatures (public-key certificates) or RSA hardware tokens for all users with NO login/password option (without question a rock-solid solution, but only practical for closed, dedicated services)
  6. Enforced ultra-strong password schemes (e.g. >25 nonsense characters with symbols - again, too impractical for casual users)
  7. And finally, CAPTCHAs (which could work in most cases, but are annoying for users and virtually useless against a determined, resourceful attacker)
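For concreteness, here is a minimal sketch of trick #1 (per-IP failure counting), the fail2ban-style baseline the list refers to. This is only an illustration: the table and column names (login_failures, ip, failed_at) are made up, and any persistent store would do.

    <?php
    // Sketch of trick #1: count failed logins per IP and deny further attempts
    // once a threshold is reached within a time window. Names are illustrative.

    const MAX_FAILURES   = 5;      // failures allowed per window
    const WINDOW_SECONDS = 900;    // 15-minute window

    function ip_is_blocked(PDO $db, string $ip): bool
    {
        $stmt = $db->prepare(
            'SELECT COUNT(*) FROM login_failures
             WHERE ip = ? AND failed_at > ?'
        );
        $stmt->execute([$ip, time() - WINDOW_SECONDS]);
        return (int) $stmt->fetchColumn() >= MAX_FAILURES;
    }

    function record_failure(PDO $db, string $ip): void
    {
        $stmt = $db->prepare('INSERT INTO login_failures (ip, failed_at) VALUES (?, ?)');
        $stmt->execute([$ip, time()]);
    }

As the list item says, a botnet that spreads its guesses across thousands of IPs never crosses this per-IP threshold, which is exactly the gap the question is about.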

Now, these are just the theoretically workable ideas. There are plenty of rubbish ideas that blow the site wide open (e.g. to trivial DoS attacks). What I want is something better. And by better, I mean:

  • It has to be secure(+) against DoS and brute-force attacks, and not introduce any new vulnerabilities that might allow a slightly sneakier bot to continue operating under the radar

  • It has to be automated. If it requires a human operator to verify each login or monitor suspicious activity, it's not going to work in a real-world scenario

  • It has to be feasible for mainstream web use (i.e. high churn, high volume, and open registration that can be performed by non-programmers)

  • It can't impede the user experience to the point where casual users will get annoyed or frustrated (and potentially abandon the site)

  • It can't involve kittens, unless they are really really secure kittens

(+) By 'secure', I mean at least as secure as a paranoid user's ability to keep his password secret

So - let's hear it! How would you do it? Do you know of a best-practice that I haven't mentioned (oh please say you do)? I admit I do have an idea of my own (combining ideas from 3 and 4), but I'll let the true experts speak before embarrassing myself ;-)

A: 

Damn, no kittens. I’m unable to offer advice, then. :(

Bombe
+2  A: 

Looks like you are trying to defend against slow distributed brute force. There's not much you can do about it. We are using a PKI and no password logins. It helps, but if your clients change workstations every once in a while, this is not very applicable.

Actually fast brute force too. I was hoping to be somewhat lenient with fixed-user brute force (throttling just 20 seconds), but on a site with 50k users, that would make variable-user *fast* brute force possible (assuming 20+ seconds to cycle through the users). And that, as they say, would suck.
Jens Roland
Well, for fast brute force from a single host, use iptables or whatever firewall you use.
I was referring to distributed fast brute force. It's rare but it's potentially very nasty
Jens Roland
A: 

1) What about requiring a one-time-password before entering their normal password? That would make it very obvious that someone was attacking before they got many opportunities to guess the main password?

2) Keep a global count/rate of login failures - this is the indicator for an attack - during an attack be stricter about login failures e.g. ban IPs more rapidly.
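A rough sketch of point 2, the sitewide failure counter that flips the site into a stricter "under attack" mode. APCu is assumed here purely for illustration; any shared counter (database, memcached, Redis) would work the same way, and all names are made up.

    <?php
    // Sketch: sitewide failure counter used as an attack indicator.

    const ATTACK_THRESHOLD = 500;   // sitewide failures per window before reacting
    const WINDOW_SECONDS   = 600;   // 10-minute window

    function record_sitewide_failure(): void
    {
        apcu_add('login_failures_window', 0, WINDOW_SECONDS);  // create with TTL if missing
        apcu_inc('login_failures_window');
    }

    function site_under_attack(): bool
    {
        return (int) apcu_fetch('login_failures_window') >= ATTACK_THRESHOLD;
    }

    // During an attack: ban offending IPs after fewer failures, require a CAPTCHA, etc.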

Douglas Leeder
1) How would you implement a one-time password on an insecure, unauthenticated line? In other words, when does the user set these one-time passwords? 2) Yes, that's the gist of #4 on my list, the sitewide limit on failed attempts. The downside is the DoS opportunity it opens.
Jens Roland
+8  A: 

If I understand the MO of brute force attacks properly, then one or more usernames are tried continuously.

There are two suggestions which I don't think I've seen yet here:

  • I always thought that the standard practice was to have a short delay (a second or so) after each wrong login for every user. This deters brute force, but I don't know how long a one-second delay would keep a dictionary attack at bay. (dictionary of 10,000 words == 10,000 seconds == about 3 hours. Hmm. Not good enough.)
  • Instead of a site-wide slowdown, why not a username throttle? The throttle becomes increasingly harsh with each wrong attempt (up to a limit, I guess, so the real user can still log in)

Edit: In response to comments on a username throttle: this is a username specific throttle without regard to the source of the attack.

If the username is throttled, then even a coordinated username attack (multi IP, single guess per IP, same username) would be caught. Individual usernames are protected by the throttle, even if the attackers are free to try another user/pass during the timeout.

From an attacker's point of view, during the timeout you may be able to take a first-time guess at 100 passwords, and quickly discover one wrong password per account. You may only be able to make 50 second-round guesses in the same time period.

From a user account point of view, it still takes the same average number of guesses to break the password, even if the guesses are coming from multiple sources.

For the attackers, at best, it will be the same effort to break 100 accounts as it would be to break 1 account, but since you're not throttling on a site-wide basis, you can ramp up the throttle quite quickly.
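A minimal sketch of the per-username throttle described above: the enforced delay grows with each consecutive failure for that username and is capped so the real owner can still get in. Storage and names are illustrative, not a prescribed implementation.

    <?php
    // Sketch: per-username escalating throttle.

    const MAX_DELAY_SECONDS = 900;  // cap so legitimate users aren't locked out forever

    /** Seconds the account must wait before the next attempt is accepted. */
    function throttle_delay(int $consecutiveFailures): int
    {
        if ($consecutiveFailures < 1) {
            return 0;
        }
        // 1s, 2s, 4s, 8s, ... capped at MAX_DELAY_SECONDS
        return (int) min(MAX_DELAY_SECONDS, pow(2, $consecutiveFailures - 1));
    }

    function attempt_allowed(array $account): bool
    {
        $wait = throttle_delay($account['consecutive_failures']);
        return time() >= $account['last_failure_at'] + $wait;
    }

    // Example: after 5 straight failures the account waits 16 seconds,
    // after 11 it waits the full 15-minute cap.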

Extra refinements:

  • detect IPs that are guessing multiple accounts - 408 Request Timeout
  • detect IPs that are guessing the same account - 408 Request Timeout after a large (say 100) number of guesses.

UI ideas (may not be suitable in this context), which may also refine the above:

  • if you are in control of the password setting, then showing the user how strong their password is encourages them to pick a better one.
  • if you are in control of the login page, after a small (say 10) number of guesses of a single username, offer a CAPTCHA.
jamesh
A username throttle plus an IP throttle is fine against fixed-username or fixed-IP attacks, and they do make traditional dictionary attacks infeasible. But if the attacker constantly changes usernames, he will slip by without triggering a username throttle. That's what I want to counter
Jens Roland
A breadth-first attack on usernames isn't going to be very successful.
Joe Philllips
Thanks for the edit, jamesh. Now we're talking. I love the idea of the 408. However, even with strict username throttling, a botnet attacking multiple users would still work. And checking the top 5000 passwords against one user is LESS likely to succeed than checking THE top 1 password on 5000 users
Jens Roland
Nothing like the birthday paradox. In a large group, many will use insecure passwords, and one is likely to use any given popular one. There will also be a fair number of people like me who aren't going to be caught by such an attack.
David Thornley
Actually, I may have to re-check the math on my previous statement. Once you have ruled out the top N most common passwords, the probability of the user having password #(N+1) may increase enough to even out the difference. Although the curve is probably steep enough for that not to be the case
Jens Roland
+1  A: 

No matter how good your system is, it'll fail under a long enough attack. There are some good ideas here on how to extend the time a password holds up under attack. (I personally like the idea of exponentially-increasing attempt rate limiting per-user and per-IP address.) But no matter what you go with, you'll need to back it up with some password rules.

I'd encourage you to figure out how fast a password can be cracked, and have users change them twice as often as that. Hope this helps.

Edit: If you expect a lot of lazy attackers, requiring some CAPTCHA after multiple failed attempts is good: it raises the bar a little. If you're worried about a lot of intelligent attackers, hire a security consultant. ;)

ojrac
This will just limit their use of it, it won't stop them from breaking it. Passwords fall in random time, not fixed time!
Loren Pechtel
That's an excellent point, Loren -
Jens Roland
Loren, you're right that security is no guarantee; the goal is to improve your odds until improving is more expensive than a breach. That's why you should get a rough number for how long a password lasts on average, and make users change the password way more often than that.
ojrac
For brute-force password attacks on multiple users, it really doesn't matter all that much if the password changes mid-attempt. Besides, the average time-to-guess may be sufficiently short that users will either refuse to change passwords or won't be online often enough.
David Thornley
+6  A: 
davethegr8
I'm sure there is more to it, but if the SiteKey idea is exactly what you mentioned, an attacker doesn't have to be a MITM, he can just run two or three login attempts for that user, and pick the image that is repeating among the random ones. Even if the set of 8-15 pictures is static for user X,
Jens Roland
(continued) it probably wouldn't be too difficult to pick the correct one, since people tend to pick predictable types of images (even images from their own Flickr albums!)
Jens Roland
But other than that, I like your ideas. They're all dangerously close to my 'annoyance' threshold though.
Jens Roland
Yeah, I thought of the point you brought up last night after I had gone home. I think the way to fix that is: When a user logs in and provides a correct password, display their image and some number of other random ones. When they do not provide the correct password, show some number of random
davethegr8
images + 1, which may or may not include their own image. Also, I had another idea, see the edit in the post. But yeah, these ideas are kinda difficult/complicated.
davethegr8
The kittens idea is flawed: any multiple choice test (pick one of N) can only slow a brute force attack down by a factor of N. For reasonable N, that's not necessarily good enough.
David Thornley
I actually thought of a different spin on the 'kittens' idea: the *first* time they login, show a random image from Flickr and ask the user to pick 3-5 'interesting' spots by clicking it, and store the image+coords. On the user's next login, show the image PLUS another random Flickr one, and
Jens Roland
(continued) ask the user to click the 3-5 'interesting' spots in *both* images. If the user picks the same (approximate) coords as his previous login, authenticate and store the new image+coords. Rinse and repeat with a new image for every login.
Jens Roland
(continued) If you could make an algorithm to translate coords into an 'approximate coord hash' (that would return the SAME hash for slightly-different coords), then you could even use that to encrypt the user's password before submitting it, effectively using the coords as a one-time pad.
Jens Roland
(continued) this 'kitten' scheme would work because Flickr is too large to be pre-calculated by bots (1 million new images/day), and each user might find different parts of each image 'interesting', and lastly because an 'interesting' point isn't well defined to an image recognition algorithm
Jens Roland
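One way the 'approximate coord hash' from the comment above could be sketched: snap each click to a coarse grid so slightly different clicks on the same spot produce the same hash. The grid size is an arbitrary choice and all names are made up; this is just an illustration, not Jens' actual scheme.

    <?php
    // Sketch: quantize click coordinates so nearby clicks hash identically.

    const GRID_SIZE = 40; // pixels per grid cell -- tolerance for imprecise clicks

    function approx_coord_hash(array $clicks): string
    {
        $cells = [];
        foreach ($clicks as [$x, $y]) {
            $cells[] = intdiv((int) $x, GRID_SIZE) . ':' . intdiv((int) $y, GRID_SIZE);
        }
        sort($cells);                    // click order shouldn't matter
        return hash('sha256', implode('|', $cells));
    }

    // Clicks a few pixels apart land in the same cells and hash identically:
    // approx_coord_hash([[101, 215], [340, 90]]) === approx_coord_hash([[98, 210], [352, 95]])
    // (Clicks that straddle a cell boundary still hash differently -- a real
    // scheme would need overlapping cells or fuzzy matching.)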
That "could" work, but I see a couple problems. What happens if the photo owner removes the image? How can you be sure that images returned won't be offensive to your user? How does a user remember where they clicked? (It seems difficult to forget)
davethegr8
The images would be stored locally, and I believe the Flickr API has some content rating in place, although we probably couldn't avoid the odd embarrassment now and then - just like CAPTCHAs sometimes say terrible things :) anyway, it was just a crazy idea, it's probably too esoteric for actual use
Jens Roland
It's an interesting idea, to be sure. I do think Flickr has content ratings, so it would be mostly safe on that front.
davethegr8
A: 

I don't believe there is a perfect answer, but I would be inclined to approach it on the basis of trying to confound the robots if an attack is sensed.

Off the top of my head:

Switch to an alternate login screen. It has multiple username and password blanks which really do appear, but only one of them is in the right place. The field names are RANDOM--a session key is sent along with the login screen, so the server can tell which fields are which. Succeed or fail, the key is then discarded so you can't try a replay attack--if you reject the password they get a new session ID.

Any form that is submitted with data in a wrong field is assumed to be from a robot--the login fails, period, and that IP is throttled. Make sure the random field names never match the legit field names so someone using something that remembers passwords isn't misled.
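One way to approximate this idea (not necessarily how Loren would build it): generate throwaway field names for each login form, keep the real mapping in the session, and treat any input in a decoy field as a robot. All names here are illustrative.

    <?php
    // Sketch: randomized, single-use login field names.

    session_start();

    function render_login_form(): string
    {
        $map = [
            'username' => 'f_' . bin2hex(random_bytes(8)),
            'password' => 'f_' . bin2hex(random_bytes(8)),
            'decoy'    => 'f_' . bin2hex(random_bytes(8)),
        ];
        $_SESSION['field_map'] = $map; // single use: discarded on the next POST

        return '<form method="post">'
             . '<input name="' . $map['decoy'] . '">'            // decoy, humans leave it blank
             . '<input name="' . $map['username'] . '">'
             . '<input type="password" name="' . $map['password'] . '">'
             . '<button>Log in</button></form>';
    }

    function extract_credentials(array $post): ?array
    {
        $map = $_SESSION['field_map'] ?? null;
        unset($_SESSION['field_map']);          // succeed or fail, no replay

        if ($map === null || ($post[$map['decoy']] ?? '') !== '') {
            return null;                        // missing session or decoy filled in: treat as a bot
        }
        return [$post[$map['username']] ?? '', $post[$map['password']] ?? ''];
    }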

Next, how about a different sort of captcha: You have a series of questions that won't cause problems for a human. However, they are NOT random. When the attack starts everyone is given question #1. After an hour question #1 is discarded, never to be used again and everyone gets question #2 and so on.

The attacker can't probe to download the database to put into his robot because of the disposable nature of the questions. He has to send new instructions out to his botnet within an hour to have any ability to do anything.

Loren Pechtel
The alternate login screen sounds like it would confuse humans more than machines, frankly. We are of course assuming that the attacker would have checked our security measures beforehand. He could have easily tweaked his scraper to find the correctly-placed fields.
Jens Roland
The human-checking questions have been done before, and it's not very effective. For a human botnet operator to answer one question per hour (after which the new answer would propagate to the bots) during an attack would be quite feasible.
Jens Roland
You're missing the point. The attacker can't check them in advance because the site only shows the extra defenses when an attack shows up.
Loren Pechtel
Sure, the human could see what the question was--but he has to communicate that to all his bots. That's a communications path that makes it easier to bring down the botnet.
Loren Pechtel
I don't think I am missing the point. I don't mean he would have run an attack previously to check our security measures, I mean he would have read this thread and checked the (open) source code to check for weaknesses :)
Jens Roland
+8  A: 
Jens Roland
Maybe you could generate a 'special' password for each user that they could use if in lock-down mode (and they're connecting from a new IP etc.), that special password being sufficiently complicated that it's not possible to brute-force?
Douglas Leeder
As an alternative (at login time) to the re-captcha, so that even someone who can't use re-captcha can log in (in lock-down conditions)?
Douglas Leeder
That could work, but only if the users remember those passwords even if they haven't used them before (these types of attack aren't commonplace, and no botmaster worth his salt would bother keeping one running for long after being throttled). The risk is too great that they simply couldn't remember.
Jens Roland
However, one method that could definitely work, is to provide a 'send me a lockdown code' link to those users, allowing them to get an email containing a single-use, user-specific token that would allow them to login, bypassing the throttling.
Jens Roland
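A sketch of the 'send me a lockdown code' idea from the comment above: a single-use, expiring token emailed to the account owner that bypasses the throttle once. The table, column, and function names (and the example.com link) are made up purely for illustration.

    <?php
    // Sketch: single-use lockdown code delivered by email.

    function send_lockdown_code(PDO $db, array $user): void
    {
        $code = bin2hex(random_bytes(16));          // unguessable single-use token
        $stmt = $db->prepare(
            'UPDATE users SET lockdown_code_hash = ?, lockdown_expires = ? WHERE id = ?'
        );
        $stmt->execute([hash('sha256', $code), time() + 3600, $user['id']]);

        mail($user['email'], 'Your login code',
             "Use this link within an hour: https://example.com/login?code=$code");
    }

    function lockdown_code_valid(PDO $db, int $userId, string $code): bool
    {
        $stmt = $db->prepare('SELECT lockdown_code_hash, lockdown_expires FROM users WHERE id = ?');
        $stmt->execute([$userId]);
        $row = $stmt->fetch(PDO::FETCH_ASSOC);

        return $row
            && $row['lockdown_expires'] > time()
            && hash_equals($row['lockdown_code_hash'], hash('sha256', $code));
    }
    // On success, clear lockdown_code_hash so the code can't be reused.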
You could further massively restrict the set of affected users by only throttling those with extremely weak/common passwords. Folks who routinely clear cookies probably have stronger passwords, thus they wouldn't be throttled. Throttle the top few thousand most common passwords.
Abtin Forouzandeh
@Abtin: Good idea, except that would be 'entering the arms race' -- i.e. starting a 'who can outsmart whom' with the people who create password lists for dictionary attacks. I think a better way would be to enforce a strong password policy so there *are* no weak passwords
Jens Roland
+7  A: 

A few simple steps:

Blacklist certain common usernames, and use them as a honeypot. Admin, guest, etc... Don't let anyone create accounts with these names, so if someone does try to log in with them, you know it's someone doing something they shouldn't.
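A minimal sketch of the honeypot-username idea: reserve a handful of names nobody can register, and treat any login attempt against them as a sure attack signal. The list and handling below are illustrative only.

    <?php
    // Sketch: honeypot usernames that no real account can use.

    const HONEYPOT_USERNAMES = ['admin', 'administrator', 'root', 'guest', 'test'];

    function is_honeypot(string $username): bool
    {
        return in_array(strtolower($username), HONEYPOT_USERNAMES, true);
    }

    // At registration: reject these names outright.
    // At login: log the attempt, raise the sitewide alert level, and throttle/ban the IP.
    if (is_honeypot($_POST['username'] ?? '')) {
        error_log('Honeypot login attempt from ' . ($_SERVER['REMOTE_ADDR'] ?? 'unknown'));
        http_response_code(403);
        exit;
    }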

Make sure anyone who has real power on the site has a secure password. Require admins/moderators to have longer passwords with a mix of letters, numbers and symbols. Reject trivially simple passwords from regular users with an explanation.

One of the simplest things you can do is tell people when someone tried to log into their account, and give them a link to report the incident if it wasn't them. A simple message when they log in like "Someone tried to log into your account at 4:20AM Wednesday blah blah. Click here if this wasn't you." It lets you keep some statistics on attacks. You can step up monitoring and security measures if you see that there's a sudden increase in fraudulent accesses.
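A sketch of how that notice could be produced: record failed attempts per account and surface them at the next successful login. The table and column names (account_failures, user_id, failed_at) are assumptions for illustration.

    <?php
    // Sketch: "someone tried to log into your account" notice.

    function failed_attempt_notice(PDO $db, int $userId): ?string
    {
        $stmt = $db->prepare(
            'SELECT COUNT(*) AS n, MAX(failed_at) AS last
             FROM account_failures WHERE user_id = ? AND failed_at > ?'
        );
        $stmt->execute([$userId, time() - 7 * 86400]);   // last 7 days
        $row = $stmt->fetch(PDO::FETCH_ASSOC);

        if ((int) $row['n'] === 0) {
            return null;
        }
        return sprintf(
            '%d failed login attempt(s) on your account, most recently %s. Click here if this wasn\'t you.',
            $row['n'], date('D H:i', (int) $row['last'])
        );
    }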

patros
Fine thoughts. I was definitely planning to implement an automatic password policy that varies dynamically with the user's privilege level. The honeypot idea might work for some types of attack, but if the attack is distributed, blocking the IPs that fall for it won't be effective.
Jens Roland
With respect to the 'Last attempted login time', that is a good strategy for power users (which I bet is why SO does it), but it has two weaknesses: (a) it doesn't address the problem of intrusion, it only reports that it may have happened, and (b), most users just don't remember/care
Jens Roland
Yup, the honeypot and user reporting are more about information gathering. They may provide some valuable metrics to let you know if/when a slow brute force attack is happening.
patros
+2  A: 

To summarize Jens' scheme into a pseudo state transition diagram/rulebase (a code transcription of these rules follows the list):

  1. user + password -> entry
  2. user + !password -> denied
  3. user + known_IP(user) -> front door, // never throttle
  4. user + unknown_IP(user) -> catflap
  5. (#denied > n) via catflaps(site) -> throttle catflaps(site) // slow the bots
  6. catflap + throttle + password + captcha -> entry // humans still welcome
  7. catflap + throttle + password + !captcha -> denied // a correct guess from a bot
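A literal transcription of the rulebase above into code, purely for illustration; the helper functions (is_known_ip, check_password, catflaps_throttled, captcha_passed) are assumed to exist elsewhere and are not part of any real library.

    <?php
    // Sketch: the rulebase as a single decision function.

    function login_decision(string $user, string $password, string $ip): string
    {
        if (is_known_ip($user, $ip)) {                   // rules 1-3: front door, never throttled
            return check_password($user, $password) ? 'entry' : 'denied';
        }

        // rule 4: unknown IP goes through the catflap
        if (!catflaps_throttled()) {                     // rule 5 flips this when sitewide denials exceed n
            return check_password($user, $password) ? 'entry' : 'denied';
        }

        // rules 6-7: under throttle, a correct password still needs a CAPTCHA
        if (check_password($user, $password) && captcha_passed()) {
            return 'entry';                              // humans still welcome
        }
        return 'denied';                                 // a correct guess from a bot is rejected
    }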

Observations:

  • Never throttle the front door. The Elbonian state police have your computer, in your house, but are unable to interrogate you. Brute force is a viable approach from your computer.
  • If you provide a "Forgotten your password?" link, then your email account becomes part of the attack surface.

These observations cover a different type of attack to the ones you are trying to counter.

jamesh
Absolutely the email account is part of the attack surface. I have a set of upper-bound assumptions on the security my strategy will provide, and the lowest bound is the user's own email security. If an attacker breaches a user's email, all bets are off.
Jens Roland
Also, I think your state transition diagram needs a couple of details: #3 and #4 should include password; #1 and #2 should include known_IP(user) since a login always has either known or unknown IP; and #6 is 'entry despite throttle'
Jens Roland
+4  A: 

I have to ask whether you've done cost-benefit analysis of this problem; it sounds like you're trying to protect yourself from an attacker who has enough web presence to guess a number of passwords, sending maybe 3-5 requests per IP (since you've dismissed IP throttling). How much (roughly) would that kind of attack cost? Is it more expensive than the value of the accounts you're trying to protect? How many gargantuan botnets want what you've got?

The answer might be no -- but if it is, I hope you're getting help from a security professional of some sort; programming skill (and StackOverflow score) do not correlate strongly to security know-how.

ojrac
(You mean to say if the answer is 'no' -- i.e. that the expense of a botnet attack is NOT too high in relation to the accounts)
Jens Roland
But anyway, you bring up an important point. For my own uses, I don't expect any botnet operator to care in the least, but I am releasing the source code for anyone who would like decent security for their web app, and I can't know what others might be trying to protect, or who their enemies are
Jens Roland
It won't be guarding national secrets no matter what (official systems need special certification, and I'm fairly sure nothing built on PHP can qualify), but all web applications need secure auth, so if I'm releasing this, it'd be incredibly irresponsible not to use best practices wherever I can
Jens Roland
So my short answer is: I am building this because 99.9% of web sites and apps out there have appalling security (even in the big leagues: AOL, Twitter, Myspace have all been compromised before), and in most cases because they're using shoddy auth libraries.
Jens Roland
Also, read the paper "To Catch A Predator" by Niels Provos et al. from the 2008 USENIX proceedings (link: http://www.usenix.org/events/sec08/tech/small.html). It is an eye opener: 2 months, one honeypot: 368,000 attacks from almost 30,000 distinct IPs, coming from more than 5,600 botnets!
Jens Roland
A: 

Since several folks included CAPTCHA as a fallback human mechanism, I'm adding an earlier StackOverflow question and thread on CAPTCHA's effectiveness.

Has reCaptcha been cracked / hacked / OCR’d / defeated / broken?

Using CAPTCHA doesn't limit improvements from your throttling and other suggestions, but I think the answers that include CAPTCHA as a fallback should consider the human-based methods available to people looking to break security.

Matthew Glidden
+1  A: 

My highest recommendation is to simply make sure that you keep users informed of bad login attempts to their accounts -- users will likely take the strength of their password much more seriously if they are presented with evidence that somebody is actually trying to get into their account.

I actually caught somebody that hacked into my brother's myspace account because they had tried to get into the gmail account I set up for him and used the 'reset my password by email' feature... which went to my inbox.

nvuono
+1  A: 

Disclaimer: I work for a two-factor company, but am not here to plug it. Here're some observations.

Cookies can be stolen with XSS and browser vulns. Users commonly change browsers or clear their cookies.

Source IP addresses are simultaneously dynamically variable and spoofable.

Captcha is useful, but doesn't authenticate a specific human.

Multiple methods can be combined successfully, but good taste is certainly in order.

Password complexity is good; anything password-based critically depends on passwords having sufficient entropy. IMHO, a strong password written down in a secure physical location is better than a weak password in memory. People know how to evaluate the security of paper documents much better than they know how to figure the effective entropy in their dog's name when used as a password for three different websites. Consider giving users the ability to print out a big or small page full of one-time use pass codes.
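A sketch of how such printable one-time pass codes might be handled: generate a sheet of short random codes, store only their hashes, and delete each hash once used. The table and column names are assumptions for illustration.

    <?php
    // Sketch: printable one-time pass codes.

    function generate_code_sheet(PDO $db, int $userId, int $count = 20): array
    {
        $codes = [];
        $stmt  = $db->prepare('INSERT INTO one_time_codes (user_id, code_hash) VALUES (?, ?)');

        for ($i = 0; $i < $count; $i++) {
            $code    = strtoupper(bin2hex(random_bytes(4)));   // e.g. '9F3A61BC'
            $codes[] = $code;
            $stmt->execute([$userId, hash('sha256', $code)]);
        }
        return $codes;   // shown/printed once, never stored in the clear
    }

    function redeem_code(PDO $db, int $userId, string $code): bool
    {
        $stmt = $db->prepare('DELETE FROM one_time_codes WHERE user_id = ? AND code_hash = ?');
        $stmt->execute([$userId, hash('sha256', strtoupper($code))]);
        return $stmt->rowCount() === 1;   // deleted exactly one row: code was valid and is now spent
    }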

Security questions like "what was your high-school mascot" are mostly another lousy form of "something you know", most of them are easily guessable or outright in the public domain.

As you noted, throttling back failed login attempts is a trade-off between preventing brute-force attacks and ease of DoSing an account. Aggressive lockout policies may reflect a lack of confidence in password entropy.

I personally don't see the benefit to enforcing password expiration on a website anyway. If an attacker gets your password once, he can change it then and comply with that policy just as easily as you can. Perhaps one benefit is that the user might notice sooner if the attacker changes the account password. Even better would be if the user were somehow notified before the attacker gained access. Messages like "N failed attempts since last login" are useful in this respect.

The best security comes from a second factor of authentication which is out-of-band relative to the first. Like you said, hardware tokens in the "something you have" category are great, but many (not all) have real admin overhead associated with their distribution. I don't know of any biometric "something you are" solutions good for websites. Some two-factor solutions work with OpenID providers, some have PHP/Perl/Python SDKs.

Marsh Ray
All excellent points - I couldn't agree more. The point about cookie insecurity is very valid, but without a second factor of physical tokens or one-time passwords (distributed over a secure line) you really can't protect against a vulnerable endpoint. If the user's box/browser is compromised, so are his logins.
Jens Roland
+2  A: 
cballou