I already looked at the most popular Django hit counter solutions and none of them seem to solve the issue of spamming the refresh button.

Do I really have to log the IP of every visitor to keep them from artificially boosting page view counts by spamming the refresh button (or writing a quick and dirty script to do it for them)?

More information

Right now you can inflate your view count with the following few lines of Python. It's so little code that you don't even need to write a script; you could just type it into an interactive session:

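from urllib.request import urlopen  # Python 3; Python 2 used "from urllib import urlopen"

num_of_times_to_hit_page = 100
url_of_the_page = "http://example.com"

# Every request bumps the counter if the site does no deduplication
for x in range(num_of_times_to_hit_page):
    urlopen(url_of_the_page)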

Solution I'll probably use

To me, it's a pretty rough situation when you need to do a bunch of writes to the database on EVERY page view, but I guess it can't be helped. I'm going to implement IP logging because several users have been artificially inflating their view counts. It's not that they're bad people or even bad users.

See the answer about solving the problem with caching... I'm going to pursue that route first. Will update with results.

For what it's worth, it seems Stack Overflow is using cookies (I can't increment my own view count, but it increased when I visited the site in another browser.)

I think the benefit of a higher view count is just too tempting, and this sort of 'cheating' is just too easy right now.

Thanks for the help everyone!

+1  A: 

You could send them a cookie when they access it and then check for that cookie. It can still be gamed, but it's a bit harder.
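As a rough illustration of the cookie check, here is a minimal sketch. The helper names and the "viewed_<page id>" cookie naming scheme are made up for this example; the only assumption about the framework is that request cookies are available as a dict-like mapping (as Django's request.COOKIES is) and that the view sets the cookie on the response afterwards.

```python
def cookie_name_for(page_id):
    """Hypothetical naming scheme: one marker cookie per page."""
    return f"viewed_{page_id}"


def should_count_view(cookies, page_id):
    """Return True if this browser has not been counted for the page yet.

    `cookies` is any dict-like mapping of cookie names to values,
    for example Django's request.COOKIES.
    """
    return cookie_name_for(page_id) not in cookies


# Simulate two requests from the same browser.
browser_cookies = {}
first = should_count_view(browser_cookies, 42)

# After counting, the view would set the marker cookie on the response,
# e.g. in Django: response.set_cookie(cookie_name_for(42), "1", max_age=60 * 60)
browser_cookies[cookie_name_for(42)] = "1"
second = should_count_view(browser_cookies, 42)

print(first, second)  # True False
```

A script that never sends cookies back (like the urlopen loop in the question) would still be counted every time, which is the "can still be gamed" part.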

scompt.com
+7  A: 

There is no foolproof way of preventing someone from artificially inflating a count. Rather, there's the extent to which you're willing to spend time making it more difficult for them to do so:

  • Not at all (they click refresh button)
  • Set a cookie, check cookie to see if they were already there (they clear cookies)
  • Log IP addresses (they fake a different IP every time)
  • Require signin with an email they respond from (they sign up for multiple email accounts)

So, in the end, you just need to pick the level of effort you want to go to in order to prevent users from abusing the system.

RHSeeger
well right now you can inflate your view count with a 4 line python script :(
Jiaaro
from urllib import urlopen; for x in range(100): urlopen('http://addressofpage.com')
Jiaaro
make that a 2 line python script :(
Jiaaro
+3  A: 

Logging an IP is probably the safest. It's not perfect, but it's better than cookies and less annoying to users than requiring a signup. That said, I'd recommend not bothering with saving these in a DB. Instead, use Django's low-level caching framework. The key would be the IP and the value a simple boolean. Even a file-based cache should be pretty fast, though go with memcached as the cache backend if you really expect heavy traffic.

Something like this should work:

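from django.core.cache import cache

# Inside a view, where `request` is the current HttpRequest
ip = request.META['REMOTE_ADDR']
has_voted = cache.get(ip)  # None if this IP has not been seen (or the entry expired)
if not has_voted:
    cache.set(ip, True)
    # code to save vote goes here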
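For page views specifically, one possible variation is to key on both the page and the IP, so that one visitor can still be counted once per page, and to give the entry an expiry. The sketch below is only an illustration under those assumptions: `InMemoryCache` is a made-up stand-in so the example runs without Django, and the "viewed:<page>:<ip>" key format is hypothetical. It relies on an `add(key, value, timeout)` method that stores the key only if it is absent and returns True when it did, which is how Django's cache.add behaves.

```python
import time


class InMemoryCache:
    """Tiny stand-in for a cache backend, for illustration only."""

    def __init__(self):
        self._store = {}  # key -> (value, expiry timestamp)

    def add(self, key, value, timeout):
        """Store the key only if absent or expired; return True if stored."""
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return False  # still fresh: this visitor was already counted
        self._store[key] = (value, now + timeout)
        return True


def should_count_view(cache, ip, page_id, ttl=60 * 60):
    """Return True the first time this IP views this page within `ttl` seconds."""
    key = f"viewed:{page_id}:{ip}"  # hypothetical key format
    return cache.add(key, True, ttl)


cache = InMemoryCache()
print(should_count_view(cache, "203.0.113.5", 1))  # True: first view
print(should_count_view(cache, "203.0.113.5", 1))  # False: refresh within the hour
print(should_count_view(cache, "203.0.113.5", 2))  # True: different page
```

In a real view you would pass Django's cache object instead of the stand-in and increment the counter only when the function returns True. Using add rather than a separate get and set keeps the check and the write in one call.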
mazelife
I should add that you'll want to set the cache TTL to something sensible, like an hour: cache.set(ip, True, 60 * 60)
mazelife
Thanks... this is much nicer and simpler than the big slow db ip logging I thought I was going to have to do... I'll have to remember not to be stupid in the future ;)
Jiaaro