views: 127
answers: 3

Hello,

The problem is that a content website is being scraped so badly that it breaks the server.

Is there an easy method of limiting each IP to a fixed number of requests at a time or per day? (e.g. 10 pages/day, or 10 pages every 2 minutes)

Ideally, I would keep a whitelist for search engines and stop everybody else from accessing content too fast or too often.

Thanks guys!

+2  A: 

One way around this would be to use iptables (Linux only) to prevent individual IPs from opening more than a specified number of connections. Tuning the numbers takes some trial and error, but overall this should cap the attacker's connection rate:

# Create the chain and route new HTTP connections through it
iptables -N STOP-ABUSE
iptables -A INPUT -p TCP --dport 80 -m state --state NEW -j STOP-ABUSE
# Record each source IP, then drop it once it opens 3+ new connections within 10 seconds
iptables -A STOP-ABUSE -m recent --set
iptables -A STOP-ABUSE -m recent --update --seconds 10 --hitcount 3 -j DROP
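
Since the question also asks about letting search engines through, you could insert a bypass rule ahead of the limit. A minimal sketch, assuming you have verified the crawler range yourself (the one below is only illustrative):

# Illustrative: let a verified crawler range skip the rate limit
iptables -I STOP-ABUSE 1 -s 66.249.64.0/19 -j RETURN

Note that iptables rules do not survive a reboot unless you save them (e.g. with iptables-save).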

Hope it helps

Marcos Placona
A: 

You can install modules such as mod_bandwidth and mod_limitipconn to throttle bandwidth usage globally and to cap simultaneous connections per client IP.

Check http://mansurovs.com/tech/apache-bandwidth-throttling for more info.
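
As a minimal sketch of the mod_limitipconn side (the directives come from that module's documentation; the limit value and the image exemption are arbitrary examples), something like this would go in the Apache configuration:

# mod_limitipconn needs full status information
ExtendedStatus On
<IfModule mod_limitipconn.c>
    <Location />
        # Allow at most 3 simultaneous connections per client IP (example value)
        MaxConnPerIP 3
        # Don't count image downloads against the limit
        NoIPLimit image/*
    </Location>
</IfModule>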

wimvds
A: 

I would prefer doing that at the system level, using iptables...

But if you're looking for a solution based on Apache, an idea might be to use mod_security.

The SecGuardianLog configuration directive looks especially interesting in your case (quoting):

Description: Configuration directive to use the httpd-guardian script to monitor for Denial of Service (DoS) attacks.

By default httpd-guardian will defend against clients that send more than 120 requests in a minute, or more than 360 requests in five minutes.
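
For reference, the directive takes a piped invocation of the script; the install path below is an example, so adjust it to wherever the ModSecurity tools live on your system:

# Pipe guardian log data to the httpd-guardian script (example path)
SecGuardianLog |/usr/local/apache/bin/httpd-guardian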

Pascal MARTIN