With reference to this post: http://stackoverflow.com/questions/2122546/how-to-implement-tag-counting

I have implemented the suggested three-table tagging system completely. To count the number of articles per tag, I am using another column named tagArticleCount in the tag definition table (its columns are tagId, tagText, tagUrl, and tagArticleCount).

If I implement real-time editing of this table, then whenever a user adds a tag to an article or deletes an existing one, the tag definition table must be updated to adjust the counter of the added or removed tag. This costs an extra query each time a modification is made (at the same time, the related link entry for the tag and article is added to or deleted from tagLinkTable).
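To make the real-time option concrete, here is a minimal sketch of it. It assumes SQLite in place of MySQL and Python in place of PHP so that it runs on its own; the table names `tag` and `tagLink` and the helper functions are hypothetical stand-ins for the tables described above, not the actual schema.

```python
import sqlite3

# Hypothetical schema modelled on the tables described above (SQLite stands in for MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (
        tagId INTEGER PRIMARY KEY,
        tagText TEXT,
        tagArticleCount INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE tagLink (
        tagId INTEGER,
        articleId INTEGER,
        PRIMARY KEY (tagId, articleId)
    );
    INSERT INTO tag (tagId, tagText) VALUES (1, 'php');
""")

def add_tag(conn, tag_id, article_id):
    # Insert the link row and bump the counter in one transaction,
    # so both changes are committed together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO tagLink (tagId, articleId) VALUES (?, ?)",
            (tag_id, article_id),
        )
        conn.execute(
            "UPDATE tag SET tagArticleCount = tagArticleCount + 1 WHERE tagId = ?",
            (tag_id,),
        )

def remove_tag(conn, tag_id, article_id):
    with conn:
        cur = conn.execute(
            "DELETE FROM tagLink WHERE tagId = ? AND articleId = ?",
            (tag_id, article_id),
        )
        # Only decrement when a link row was actually deleted.
        if cur.rowcount:
            conn.execute(
                "UPDATE tag SET tagArticleCount = tagArticleCount - 1 WHERE tagId = ?",
                (tag_id,),
            )

def tag_count(conn, tag_id):
    row = conn.execute(
        "SELECT tagArticleCount FROM tag WHERE tagId = ?", (tag_id,)
    ).fetchone()
    return row[0]
```

The extra cost per modification is the single UPDATE on the tag row, which is a primary-key lookup.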

An alternative is to not allow any real-time editing of the counter, and instead use cron jobs to update the counter of each tag after a specified time period. This can be seen as caching the article count in the database.

Here comes the problem that I want to discuss: can you please help me find a way to present the articles in a list when a tag is explored and the article counter for that tag is not up to date? For example:

1. The counter shows 50 articles, but there are in fact 55 entries in the tag link table (which links tags and articles).
2. The counter shows 50 articles, but there are in fact 45 entries in the tag link table.
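For reference, the periodic recount itself could look roughly like the sketch below. Again this assumes SQLite and Python purely so the example is self-contained; the table names `tag` and `tagLink` and the functions `recount_tags` and `cached_counts` are hypothetical. The sample data reproduces both kinds of drift on a small scale: one counter that is too low and one that is too high.

```python
import sqlite3

# Hypothetical schema and data (SQLite stands in for MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (
        tagId INTEGER PRIMARY KEY,
        tagText TEXT,
        tagArticleCount INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE tagLink (
        tagId INTEGER,
        articleId INTEGER,
        PRIMARY KEY (tagId, articleId)
    );
    -- Cached counter says 2, but three link rows exist (counter too low).
    INSERT INTO tag VALUES (1, 'php', 2);
    INSERT INTO tagLink VALUES (1, 10), (1, 11), (1, 12);
    -- Cached counter says 5, but only one link row exists (counter too high).
    INSERT INTO tag VALUES (2, 'mysql', 5);
    INSERT INTO tagLink VALUES (2, 10);
""")

def recount_tags(conn):
    # What the scheduled job would run: recompute every counter
    # from the link table in a single statement.
    with conn:
        conn.execute("""
            UPDATE tag
            SET tagArticleCount = (
                SELECT COUNT(*) FROM tagLink WHERE tagLink.tagId = tag.tagId
            )
        """)

def cached_counts(conn):
    # Map each tag's text to its stored (possibly stale) counter.
    return dict(conn.execute("SELECT tagText, tagArticleCount FROM tag"))
```

Between two runs of the job, the stored counter can be wrong in either direction, which is exactly the situation in the two examples above.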

How should I handle the two scenarios given in the example? I am going to use APC to cache these counters, so please consider that in your solution. Also, please discuss the performance of real-time versus cron-based counter updates.

+1  A: 

It all comes down to the needs of your application. How crucial is it for the information to be up to date? In most cases, I would think that it would be worth the extra query to have real-time data.

I actually recently faced the same challenge on a system I am developing, but ultimately decided that the solution which used a field for storing a tag-count would not work. It might be worth considering the reason I went another way, in case it is applicable to your situation:

With the field-based method, you only have one count available. For my system, I wanted to be able to have several levels of depth available. So, using the tags on this article as an example, I wanted to know more than the overall counts of 'php', 'mysql', 'best-practices', 'performance', and 'tagging'. I also wanted to know the counts of various combinations.

The solution I went with was to use a count(*) as follows:

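-- Returns one row per item that carries all three categories
-- (assuming each item/category pair is stored only once);
-- the number of rows returned is the count for this combination.
SELECT COUNT(*)
FROM items i
JOIN categories c ON c.Id = i.Id
WHERE c.category IN ('php', 'mysql', 'tagging')
GROUP BY i.Id
HAVING COUNT(i.Id) = 3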

To mitigate the possibility of this getting slow, I use AJAX to populate the page segments into which the related data is displayed.
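As a rough illustration of the count(*) approach, the sketch below runs the same kind of query against a tiny in-memory SQLite database from Python. This is not the code from my system: the sample data and the function name `count_items_with_all` are made up, and the query is wrapped in an outer COUNT(*) so that it returns a single number instead of one row per matching item.

```python
import sqlite3

# Hypothetical sample data; categories.Id refers to the item's Id,
# mirroring the join condition c.Id = i.Id used above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (Id INTEGER PRIMARY KEY);
    CREATE TABLE categories (Id INTEGER, category TEXT, PRIMARY KEY (Id, category));
    INSERT INTO items VALUES (1), (2), (3);
    INSERT INTO categories VALUES
        (1, 'php'), (1, 'mysql'), (1, 'tagging'),
        (2, 'php'), (2, 'mysql'),
        (3, 'php'), (3, 'mysql'), (3, 'tagging'), (3, 'performance');
""")

def count_items_with_all(conn, tags):
    # Build one "?" placeholder per tag, e.g. "?,?,?" for three tags.
    placeholders = ",".join("?" * len(tags))
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT i.Id
            FROM items i
            JOIN categories c ON c.Id = i.Id
            WHERE c.category IN ({placeholders})
            GROUP BY i.Id
            HAVING COUNT(DISTINCT c.category) = ?
        )
    """
    # The inner query keeps only items that match every requested tag;
    # the outer COUNT(*) turns those rows into a single number.
    return conn.execute(sql, (*tags, len(tags))).fetchone()[0]
```

With this data, `count_items_with_all(conn, ['php', 'mysql', 'tagging'])` counts items 1 and 3, while item 2 is excluded because it lacks 'tagging'.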

JGB146