views: 97
answers: 4

I assume it's bots, or something like them. We have forums on our website, and every day we get thousands of attempted spam posts. These never actually make it into the database; they usually fail with a ViewState or EventValidation exception. I'm not sure whether I should even be concerned, but I'd really like to do something about these bots. Not only are they wasting our resources, but it's more than a little annoying to sift through all these errors to find the real ones. Any suggestions?

A: 

I believe CAPTCHA was designed to fix this problem.

sheepsimulator
CAPTCHA just tells me whether it's a human or not... it doesn't actually keep the data from being sent to the webserver, does it?
Al W
Ah... I see what you're getting at. You want to cut off any possibility of bots even sending you data. Sorry, I misunderstood your question.
sheepsimulator
A: 

The best solution depends on the popularity (number of users) of your forum.

Most forum software has plug-ins for CAPTCHA and related technologies, and that's what you want for a large site. For a small site you can cheat by simply adding a random question to the submission form, like "Are you human?" If they don't type "yes" in the input box, they fail your (Turing?) test. Most spammers don't actually visit your site; they simply run scripts looking for known forum software or obvious comment forms.
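The "Are you human?" check described above amounts to a one-line server-side test. A minimal sketch, assuming the posted fields arrive as a dict and using a hypothetical field name `are_you_human` (use whatever your form actually calls it):

```python
def looks_human(form_data):
    """Reject submissions that fail the 'Are you human?' question.

    `form_data` is a dict of posted form fields. The field name
    'are_you_human' is a placeholder -- match it to your form.
    Scripted spammers rarely fill in site-specific custom fields,
    so a missing or wrong answer is a strong bot signal.
    """
    answer = form_data.get("are_you_human", "")
    return answer.strip().lower() == "yes"
```

Run this before any database write; submissions that fail it can be dropped outright or diverted to a spam log.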

In response to your last comment: you can't stop an actual human from spamming your site (even stripping links is not enough), and you certainly can't stop anyone from sending you data without turning off your website. You should simply have moderators remove any spam that gets through your CAPTCHA.

SpliFF
+2  A: 

It sounds like this isn't a content problem: users don't see the spam, because the vast majority of the submissions are somehow mis-formatted. You've got a couple of options, depending on how much control you have over your software:

  1. If you wrote the forum software, or are able and comfortable with modifying it, you could catch the most common exceptions these broken submissions throw. (It sounds like you've already identified them.) You could write those exceptions to a separate "spam log" or some such, which would let you do stats and reporting down the road.
  2. Using either the data from your spam log, or maybe even what you're logging currently, you could identify IPs or ranges that often send these bad submissions and block them at your firewall. If this really is spamming, though, chances are the bots have ways around it, since that's a pretty basic spam-blocking strategy.
  3. It's also possible that this isn't spam but a broken browser. If you add User-Agent information to the exception/spam logs, you might be able to trace it. You might get lucky: it could turn out that fixing your forms for IE5 on Mac or Opera Mini or something like that would not only prevent these exceptions but also bump up your visitor numbers.
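Steps 1 and 2 above can be combined into a single log-splitting pass. A rough sketch, assuming each log line looks like "IP exception-type message" (the exception names and line format are assumptions; adapt both to what your application actually logs):

```python
import collections

# Assumed exception names -- substitute the ones your forum throws.
SPAM_EXCEPTIONS = ("ViewStateException", "EventValidationException")

def split_logs(log_lines):
    """Route likely-bot errors to a spam log and keep real errors separate.

    Returns (spam_lines, real_lines, ip_counts). The IP counts give you
    candidates for firewall blocks in step 2.
    """
    spam, real = [], []
    ip_counts = collections.Counter()
    for line in log_lines:
        parts = line.split(None, 2)
        if len(parts) < 2:
            real.append(line)          # unparseable -- don't hide it
            continue
        ip, exc = parts[0], parts[1]
        if any(name in exc for name in SPAM_EXCEPTIONS):
            spam.append(line)
            ip_counts[ip] += 1
        else:
            real.append(line)
    return spam, real, ip_counts
```

With the bad submissions diverted like this, the remaining error log contains only the failures worth a human's attention.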

Unless these submissions are making a measurable impact on your site's performance, I don't think there's going to be much use in doing a lot more than that. Adding CAPTCHAs wouldn't prevent spam from being submitted, just from being successfully submitted (which doesn't sound like a problem right now). The only thing worth your time at this point is breaking the bad submissions out into a separate log.

Plutor
A: 

You could look at your webserver's log files and see what User-Agent those connections are coming from. Browsers such as IE and Firefox send a User-Agent signature along the lines of 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)'.

The bots will likely have one or a few specific User-Agent signatures; you could add these to a blacklist in your server's configuration files so that your server simply ignores requests from them.
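If you'd rather do the check in application code than in server configuration, the blacklist test is simple. A sketch with a hypothetical blacklist (build yours from the signatures you actually see in your access logs):

```python
# Hypothetical examples of bot User-Agent fragments -- replace with
# the signatures found in your own logs.
BLOCKED_AGENTS = ("libwww-perl", "python-urllib", "wget")

def should_ignore(user_agent):
    """Return True if the request's User-Agent matches the blacklist.

    Matching is substring-based and case-insensitive, since bots often
    vary only the version number in the signature.
    """
    ua = (user_agent or "").lower()
    return any(fragment in ua for fragment in BLOCKED_AGENTS)
```

Keep in mind the User-Agent header is trivially forged, so treat this as a way to cut down noise rather than a real security measure.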

Also, you should take a minute to read through

http://www.kloth.net/internet/bottrap.php

instanceofTom