The page view counter on each MediaWiki page seems like a great way to identify popular pages which are worth putting more effort into keeping up-to-date and useful, but I've hit a problem.

We use a Google Search Appliance to index our MediaWiki installation. The problem I have is that the GSA increments the page view counter each time it crawls the page. This completely dominates the statistics, swamping the views made by real users.

I know how to reset the page counters to start again. But is there a way to configure MediaWiki to ignore page requests from the GSA for the purposes of counting page views?

+2  A: 

Hi, this can be done by adding a condition in Article.php:

includes/Article.php:2861:function viewUpdates():

if( !$wgDisableCounters && !$wgUser->isAllowed('bot') && $this->getID() ) {

Add this clause inside the condition, before the closing parenthesis:

&& strpos($_SERVER['HTTP_USER_AGENT'], 'gsa-crawler') === false

where gsa-crawler is part of the GSA's default User-Agent string.

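Another way is to set up Forms Authentication in the GSA and have it log in to MediaWiki as a user in the bot group. The condition above already skips the counter for users with the bot right, so no code change is needed.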

jspcal
Works perfectly! Note that you don't have to hack the code directly - I added this extra condition to LocalSettings.php, so it's maintainable across version upgrades.
ire_and_curses
+1  A: 

We added this snippet to LocalSettings.php, with great success:

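if (isset($_SERVER['HTTP_USER_AGENT']) && strpos($_SERVER['HTTP_USER_AGENT'], 'gsa-crawler') !== false) {
  // Request comes from the GSA crawler: skip the page view counter for this request only.
  $wgDisableCounters = true;
}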
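
If more than one crawler needs to be excluded, the same check could be generalised along these lines. This is only a sketch: isCrawlerUserAgent() is a hypothetical helper, not a MediaWiki function, and the marker list is an assumption you would adjust to whatever User-Agent strings show up in your own access logs. It also matches case-insensitively and treats a missing User-Agent header (for example, when maintenance scripts run from the command line) as "not a crawler".

```php
<?php
// Omit the opening tag above when pasting into an existing LocalSettings.php.

// Hypothetical helper (not part of MediaWiki): returns true when the
// User-Agent string contains any of the given markers, ignoring case.
function isCrawlerUserAgent( $userAgent, array $markers ) {
    if ( !is_string( $userAgent ) || $userAgent === '' ) {
        return false; // no User-Agent header, e.g. a command-line script
    }
    foreach ( $markers as $marker ) {
        if ( stripos( $userAgent, $marker ) !== false ) {
            return true;
        }
    }
    return false;
}

// Assumed marker list; 'gsa-crawler' is the one mentioned in this thread.
$crawlerMarkers = array( 'gsa-crawler' );

$userAgent = isset( $_SERVER['HTTP_USER_AGENT'] ) ? $_SERVER['HTTP_USER_AGENT'] : '';
if ( isCrawlerUserAgent( $userAgent, $crawlerMarkers ) ) {
    // Disable the page view counter for this request only.
    $wgDisableCounters = true;
}
```

Requests from ordinary browsers leave $wgDisableCounters untouched, so real page views are still counted.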

Thanks!