views: 241

answers: 2

Hi,

I was asked to design an algorithm to calculate the pages most viewed by users. I answered that we could use a counter, but that was not an efficient algorithm.

What would be a more efficient algorithm to calculate the most viewed pages?

Thanks.

+3  A: 

Just set up a professional stats solution such as Google Analytics and don't spin your wheels on this type of thing. Focus on your core business.

Asaph
@Asaph: They create software used for tracking website traffic, so they asked about it.
Rachel
@Rachel My bad. So this *is* your core business; I shouldn't have assumed otherwise. Knowing that, I would suggest an offline process that analyzes web server access logs instead of a realtime hit counter. The realtime hit counter you've proposed would require locking to ensure accuracy, and that's a performance overhead you don't want.
Asaph
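A minimal sketch of the offline log-analysis idea from the comment above, in Python. It assumes the access log uses the Common Log Format; the function name `top_pages`, the regular expression, and the sample lines are all hypothetical and only illustrate the approach.

```python
import re
from collections import Counter

# Assumption: the request field looks like "GET /path HTTP/1.1" (Common Log Format).
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')

def top_pages(log_lines, n=10):
    """Count page hits from access-log lines and return the n most viewed."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            # Drop the query string so /a?x=1 and /a?x=2 count as one page.
            path = match.group(1).split("?", 1)[0]
            hits[path] += 1
    return hits.most_common(n)

# Hypothetical sample lines, invented for illustration only.
sample = [
    '10.0.0.1 - - [01/Jan/2010:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2010:10:00:01 +0000] "GET /about.html HTTP/1.1" 200 256',
    '10.0.0.3 - - [01/Jan/2010:10:00:02 +0000] "GET /index.html?ref=x HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2010:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.4 - - [01/Jan/2010:10:00:04 +0000] "GET /about.html HTTP/1.1" 200 256',
    '10.0.0.5 - - [01/Jan/2010:10:00:05 +0000] "GET /contact.html HTTP/1.1" 200 128',
]

print(top_pages(sample, 2))  # [('/index.html', 3), ('/about.html', 2)]
```

Because this runs as a batch job over log files, the web server never has to take a lock on a shared counter while serving a request.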
@Asaph: Thank you for the information. So we need a separate process that just checks the web server access logs. Won't this approach add overhead as we increase the frequency of checking the logs?
Rachel
@Rachel Churning through web server logs can be expensive, and that's why you want it to happen on a different machine than your web server or database server. One thing that helps is looking at your log format and seeing if you can trim it down, removing fields that you won't need.
Asaph
@Rachel Also, make sure to ship your log files to your log processing server so they can be read locally on that box. Reading the log files remotely will yield poor performance.
Asaph
@Asaph: Thank you for the useful information.
Rachel
+2  A: 

Counters create very high contention in the database. Parsing the Apache/IIS logs and building aggregates is one simple way to build usage stats, but it requires extensive request logging and cumbersome log parsers, and it might not capture all the information you need. Queueing counter updates, on the other hand, is reliable, fairly simple to implement (once a queuing infrastructure is in place), and scales well.

Remus Rusanu
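
The queued-update idea in the answer above could be sketched roughly as follows in Python. This is an illustration under assumptions, not the answerer's implementation: an in-process `queue.Queue` stands in for real queuing infrastructure, a `Counter` named `totals` stands in for the database table, and `run_aggregator` and `batch_size` are hypothetical names.

```python
import queue
import threading
from collections import Counter

def run_aggregator(hit_queue, totals, batch_size=100):
    """Drain page hits from the queue and apply them to totals in batches."""
    batch = Counter()
    pending = 0
    while True:
        page = hit_queue.get()
        if page is None:  # sentinel: stop reading and flush what is left
            break
        batch[page] += 1
        pending += 1
        if pending >= batch_size:
            totals.update(batch)  # one write per batch instead of one per hit
            batch.clear()
            pending = 0
    totals.update(batch)

hit_queue = queue.Queue()
totals = Counter()  # assumption: stands in for the per-page counts table
worker = threading.Thread(target=run_aggregator, args=(hit_queue, totals, 3))
worker.start()

# Request handlers only enqueue a hit; they never touch the counters directly.
for page in ["/home", "/about", "/home", "/home", "/pricing", "/about"]:
    hit_queue.put(page)
hit_queue.put(None)
worker.join()

print(totals.most_common(2))  # [('/home', 3), ('/about', 2)]
```

Only the single aggregator thread writes to the counts, so concurrent requests do not contend for the same counter row, and batching reduces the number of writes.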