views:

55

answers:

3

I have a web application where I need to keep track of "most popular" (most visited) articles. Most of the pages (including article pages) in that application displays "most popular" list in sidebar, so this list will be retrieved very often. On the other hand, articles are visited quite often too (approximately 1/2 of page visits are visits to article pages).

What is the best way to keep track of visits and be able to select N most visited articles? As I understand, it should be a concurrent map articleId->visitCount that is sorted by values (visitCounts) and where I can quickly (and threadsafely) increment visitCount and expect map to re-sort itself.

+2  A: 

For a web application, the best place to store this would be in the database. Create a database with a field for article ID and a field for the number of visits. Index the table by the number of visits. Whenever an article is viewed, add the record or increment the existing record. When you need to see the list of most popular, just query the table.

Databases are frequently the best answer for where to store data in a web application.

In this case, the database will index the table based on the number of visits. This makes it a bit slower for inserting and updating, but databases are designed to do this work so it won't be too bad. Retrieval of this data will always be super fast because of the maintained index.

Erick Robertson
+1  A: 

If you don't want to use a database, then you could use a SortedSet to store objects that contained both the article ID and the visit count. Comparison of the objects would be on the visit count. Implementation could include TreeSet, which would have to be externally synchronized in a multi-threaded environment, and ConcurrentSkipListSet, which would not have to be externally synchronized.

Steve Emmerson
Most SortedSet instances where the comparison of elements was not transitive (due to the fact that the object that is being compared changes) would cause some _very strange bugs_. How would you handle the case where a > b > a ???
Jed Wesley-Smith
@Steve, do you mean to remove and re-add an entry for every update?
Shooshpanchick
@Shooshpanchick Obviously; otherwise, it wouldn't work.
Steve Emmerson
A: 

Personally, I wouldn't try and answer this during an update. You are far more likely to update your structure each visit than read it.

When it comes time to read, make a copy of each id,visit# entry and then sort that for display. You'd be surprised how cheap it is.

Jed Wesley-Smith
There may be over a million articles, so sorting them each time is not an option, unfortunately.
Shooshpanchick