views: 413

answers: 5

What is a good approach to keeping accurate counts of how many times a page has been viewed?

I'm using Django. Specifically, I don't want refreshing the page to up the count.

+1  A: 

As far as I'm aware, no browsers out there at the moment send any kind of message/header to the server saying whether the request was from a refresh or not.

The only way I can see to not count a user refreshing the page is to track the IPs and times that a user views a page, and then if the user last viewed the page less than 30 minutes ago, say, you would dismiss it as a refresh and not increment the page view count.

IMO most page refreshes should be counted as a page view anyway, as the only reason I have for refreshing is to see new data that might have been added, or the occasional accidental refresh/reloading after a browser crash (which the above method would dismiss).
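The time-window idea described above could be sketched roughly as follows. This is only an in-memory illustration, not Django-specific code: the `ViewCounter` class, its method names, and the choice to measure the window from the last *counted* view are all assumptions made for the example. A real site would keep this state in a database table or cache, and `visitor` could be an IP, a cookie value, or an IP+User-Agent pair.

```python
import time

REFRESH_WINDOW_SECONDS = 30 * 60  # views closer together than this are treated as refreshes


class ViewCounter:
    """In-memory sketch; a real site would persist this state."""

    def __init__(self, window=REFRESH_WINDOW_SECONDS):
        self.window = window
        self.counts = {}     # page -> number of counted views
        self.last_seen = {}  # (visitor, page) -> timestamp of the last counted view

    def record_view(self, visitor, page, now=None):
        """Count the view unless this visitor was counted for this page within the window.

        Returns True if the view was counted, False if it was dismissed as a refresh.
        """
        now = time.time() if now is None else now
        key = (visitor, page)
        last = self.last_seen.get(key)
        if last is not None and now - last < self.window:
            return False  # dismissed as a probable refresh
        self.last_seen[key] = now
        self.counts[page] = self.counts.get(page, 0) + 1
        return True
```

Passing `now` explicitly is only there to make the behaviour easy to check; in normal use you would call `record_view(visitor, page)` and let it read the clock.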

henrym
Tracking IPs is a bad idea. Many networks give all of their users the same external IP.
Roman
Yeah, some other method for tracking unique users would be better (cookies, IP+User Agent), but the idea is to dismiss page views that you think (as you can't reliably know) are from the user refreshing the page.
henrym
A: 

You could give each user a cookie that expires at the end of the day and contains a unique number. If the user reloads a page, you can check whether they have already been counted that day.
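A minimal sketch of that cookie approach, kept framework-neutral so it runs on its own. The cookie name `visitor_id`, the helper names, and the `seen_today` set are hypothetical; the set stands in for whatever storage you use and would need to be cleared each day.

```python
import uuid
from datetime import datetime, timedelta

COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def seconds_until_midnight(now):
    """Seconds from `now` until the next midnight, usable as the cookie's max-age."""
    tomorrow = (now + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    return int((tomorrow - now).total_seconds())


def register_view(cookies, seen_today, page, now):
    """Return (counted, cookie_to_set).

    cookies:       dict of cookies sent by the browser
    seen_today:    set of (visitor_id, page) pairs already counted today
    cookie_to_set: None, or a (name, value, max_age) tuple to set on the response
    """
    visitor_id = cookies.get(COOKIE_NAME)
    cookie_to_set = None
    if visitor_id is None:
        # First visit today: hand out a fresh unique id that expires at midnight.
        visitor_id = uuid.uuid4().hex
        cookie_to_set = (COOKIE_NAME, visitor_id, seconds_until_midnight(now))
    key = (visitor_id, page)
    if key in seen_today:
        return False, cookie_to_set  # already counted today
    seen_today.add(key)
    return True, cookie_to_set
```

In a Django view, `cookies` would be `request.COOKIES`, and the returned tuple could be applied with `response.set_cookie(name, value, max_age=max_age)`.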

Lenni
A: 

You could create a table of unique visitors per page, e.g. the visitor's IP plus the X-Forwarded-For content and the User-Agent string, along with a PageID of some sort. If the data itself is irrelevant, you can store an MD5/SHA-1 hash of these values instead (excluding the PageID, of course). Be warned, however, that this table will grow really fast.
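As a rough illustration of the hashing step, the sketch below builds a SHA-1 fingerprint from the three values and uses a set of `(page_id, fingerprint)` pairs in place of the table. The function names and the separator byte are assumptions for the example, not part of any library.

```python
import hashlib


def visitor_fingerprint(remote_addr, forwarded_for, user_agent):
    """Hash the identifying values into a fixed-size hex string (40 chars for SHA-1)."""
    parts = [remote_addr or "", forwarded_for or "", user_agent or ""]
    # Join with a separator that should not occur in the values, so that
    # ("ab", "c") and ("a", "bc") do not produce the same input.
    raw = "\x1f".join(parts)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def record_unique_view(seen, page_id, fingerprint):
    """Return True only the first time this fingerprint is seen for this page.

    `seen` is a set standing in for a table with a unique (page_id, fingerprint) key.
    """
    key = (page_id, fingerprint)
    if key in seen:
        return False
    seen.add(key)
    return True
```

In Django the three inputs would typically come from `request.META` (`REMOTE_ADDR`, `HTTP_X_FORWARDED_FOR`, `HTTP_USER_AGENT`). The fingerprint is only used for deduplication here, so SHA-1 is used as the answer suggests rather than for any security property.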

I'd advise against setting cookies for that purpose. They have a limited size, and if the user visits many pages you could reach that limit and make the solution unreliable. It also makes it harder to cache such a page on the client side (see Cacheability), since it becomes interactive content.

macbirdie
A: 

You can write a Django middleware that captures request.path, then set up a table with url / accesses columns. Beware of transactions when updating concurrently. If you have load problems, you can use memcached with its incr or add function and periodically update the database table to avoid transaction locks.
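One way the middleware-plus-buffer idea might look is sketched below. To keep it runnable without a Django project, an in-process `BufferedHitCounter` stands in for memcached and a plain dict stands in for the url / accesses table; both are assumptions for illustration, as are the class names. The middleware itself follows the shape Django expects (a callable built around `get_response`).

```python
import threading
from collections import Counter


class BufferedHitCounter:
    """Stand-in for memcached incr: counts in memory, written out in batches."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = Counter()

    def incr(self, url):
        with self._lock:
            self._pending[url] += 1

    def flush(self, store):
        """Move pending counts into `store` (a dict standing in for the database table)."""
        with self._lock:
            # Swap the buffer out under the lock so concurrent hits are not lost.
            pending, self._pending = self._pending, Counter()
        for url, hits in pending.items():
            store[url] = store.get(url, 0) + hits


hit_counter = BufferedHitCounter()


class PageViewMiddleware:
    """Shaped like a Django middleware: counts request.path, then passes the request on."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        hit_counter.incr(request.path)
        return self.get_response(request)
```

`flush` would be called periodically, for example from a cron job or management command. With a real memcached backend, Django's cache API provides `cache.add` (set only if the key is missing) and `cache.incr`, which could replace the in-process counter so that counts are shared across worker processes.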

fredz
A: 

I use Analytics

Andre Bossard