Is it possible to ban certain user agents directly from web.config? Certain robots seem not to follow robots.txt, and to avoid pointless server load (and log-file spamming) I'd like to prevent certain classes of request (in particular based on user agent, or perhaps IP address) from proceeding.

Bonus points if you know if it's similarly possible to prevent such requests from being logged to IIS's log-file entirely. (i.e. if-request-match, forward to /dev/null, if you get my meaning).

A solution for Windows Server 2003 (IIS6) would be preferable, but this is a recurring problem, so if there's a clean solution for IIS7 but not IIS6, I'd be happy to know it.

Edit: Sorry 'bout the incomplete question earlier, I had tab+entered accidentally.

A: 

Don't think you can do this from web.config (authorisation in web.config is for users, not bots). Your best bet would be some kind of custom ISAPI filter for IIS itself. There's a blog about this here. Good luck!

Dan Diplo
+2  A: 

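This can be done fairly easily using the URL Rewrite module in IIS7. The rule below aborts requests from the matching user agent, but only for the pages listed in the rewrite map, because both conditions have to match. I really don't know whether this will prevent those requests from being logged, though.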

<rewrite>
  <rules>
    <rule name="Ban user-agent RogueBot" stopProcessing="true">
      <match url=".*" />
      <conditions>
        <add input="{HTTP_USER_AGENT}" pattern="RogueBotName" />
        <add input="{MyPrivatePages:{REQUEST_URI}}" pattern="(.+)" />
      </conditions>
      <action type="AbortRequest" />
    </rule>
  </rules>
  <rewriteMaps>
    <rewriteMap name="MyPrivatePages">
      <add key="/PrivatePage1.aspx" value="block" />
      <add key="/PrivatePage2.aspx" value="block" />
      <add key="/PrivatePage3.aspx" value="block" />
    </rewriteMap>
  </rewriteMaps>
</rewrite>
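
The rule above lists two conditions, and URL Rewrite requires all of them to match by default, so RogueBot is only blocked on the pages in the MyPrivatePages map. To ban a user agent across the whole site, as the question asks, the map condition can simply be dropped. The following is a minimal sketch, assuming the URL Rewrite module is installed and that "RogueBotName" (the placeholder from the rule above) is replaced with a pattern matching the real bot. The rule name is made up for illustration.

```xml
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Sketch: abort every request whose User-Agent matches the placeholder pattern -->
        <rule name="Ban user-agent RogueBot everywhere" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <add input="{HTTP_USER_AGENT}" pattern="RogueBotName" />
          </conditions>
          <action type="AbortRequest" />
          <!-- Alternative: answer with an explicit 403 instead of dropping the connection:
          <action type="CustomResponse" statusCode="403"
                  statusReason="Forbidden" statusDescription="User agent not allowed" />
          -->
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```

AbortRequest drops the connection without sending a response, while CustomResponse lets the bot see a proper status code. Which one is preferable probably depends on how the bot reacts to each.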
Albert Walker
Well, the site is small enough that the IIS log isn't a performance problem; it's mostly just noise I wouldn't mind avoiding. But this solution is exactly what I was hoping for: a configurable module that can abort certain requests. I'll look into it, thanks!
Eamon Nerbonne
+1  A: 

You could write a custom ASP.NET HttpModule, as I did for my site, to ban some rogue bots. Here's the code:

using System;
using System.Configuration;
using System.Text.RegularExpressions;
using System.Web;

// Redirects any request whose User-Agent header matches a regex from appSettings.
// Note: the web.config registration expects this class in the Andies.Web.Traffic namespace.
public class UserAgentBasedRedirecter : IHttpModule
{
    private static readonly Regex _bannedUserAgentsRegex;
    private static readonly string _bannedAgentsRedirectUrl;

    // Reads both settings once, when the type is first used.
    static UserAgentBasedRedirecter()
    {
        _bannedAgentsRedirectUrl = ConfigurationManager.AppSettings["UserAgentBasedRedirecter.RedirectUrl"];
        if (String.IsNullOrEmpty(_bannedAgentsRedirectUrl))
            _bannedAgentsRedirectUrl = "~/Does/Not/Exist.html";

        // If no regex is configured, the field stays null and the module does nothing.
        string regex = ConfigurationManager.AppSettings["UserAgentBasedRedirecter.UserAgentsRegex"];
        if (!String.IsNullOrEmpty(regex))
            _bannedUserAgentsRegex = new Regex(regex, RegexOptions.IgnoreCase | RegexOptions.Compiled);
    }

    #region Implementation of IHttpModule

    public void Init(HttpApplication context)
    {
        // Runs just before the handler executes; BeginRequest would reject even earlier.
        context.PreRequestHandlerExecute += RedirectMatchedUserAgents;
    }

    private static void RedirectMatchedUserAgents(object sender, EventArgs e)
    {
        HttpApplication app = sender as HttpApplication;

        if (_bannedUserAgentsRegex != null &&
            app != null && app.Request != null && !String.IsNullOrEmpty(app.Request.UserAgent))
        {
            if (_bannedUserAgentsRegex.IsMatch(app.Request.UserAgent))
            {
                app.Response.Redirect(_bannedAgentsRedirectUrl);
            }
        }
    }

    public void Dispose()
    { }

    #endregion
}

You'll need to register it in web.config and specify the regular expression to use to match user agent strings. Here's one I used to ban msnbot/1.1 traffic:

<configuration> 
 <appSettings>
  <add key="UserAgentBasedRedirecter.UserAgentsRegex" value="^msnbot/1\.1" />
 </appSettings>
...
 <system.web>
  <httpModules>
   <add name="UserAgentBasedRedirecter" type="Andies.Web.Traffic.UserAgentBasedRedirecter, Andies.Web" />
  </httpModules>
 </system.web>
</configuration>
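
The `<httpModules>` section above is what IIS6 and IIS7 in classic mode read. When the site runs in an IIS7 application pool with the integrated pipeline, modules are registered under `<system.webServer>` instead. A minimal sketch of that variant, assuming the same type and assembly names as above:

```xml
<configuration>
  <appSettings>
    <!-- The dot is escaped so that it matches a literal "." only -->
    <add key="UserAgentBasedRedirecter.UserAgentsRegex" value="^msnbot/1\.1" />
  </appSettings>
  <system.webServer>
    <!-- Only needed if the <httpModules> entry also stays in <system.web> -->
    <validation validateIntegratedModeConfiguration="false" />
    <modules>
      <!-- No preCondition="managedHandler", so the module is not limited to managed handlers -->
      <add name="UserAgentBasedRedirecter" type="Andies.Web.Traffic.UserAgentBasedRedirecter, Andies.Web" />
    </modules>
  </system.webServer>
</configuration>
```

Leaving out the managedHandler precondition should let the module see requests for static files too, which is probably what you want when the goal is to keep a bot off the whole site.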
Andrew Smith
This looks even more like what I was looking for :-) thanks! Do you happen to know if this prevents requests from being logged? Probably not, right?
Eamon Nerbonne
Haven't checked, but I would imagine that since the request has already gone through the ASP.NET pipeline, it's already in the logs.
Andrew Smith