views:

13

answers:

2

We have a requirement to tag internal assets (code artifacts ...) which are found within the intranet, so that it can be searched with ease later on.

We would like to show the count of assets tagged by a certain tag name (much similar to http://stackoverflow.com/tags). I would like to understand if this count is measured dynamically when a user visits this page or is it rendered from a tag_count table which contains information about the tag and the number of entries associated with it (Note: the assumption is that this tag_count table is updated on a periodic basis with the number of tags associated with an asset asynchronously).

What would be the preferred approach to solve the above problem, if the number of assets is in the range of 10's of thousands?

+1  A: 

There is another option here...instead of a tag count table, a simple query that gets the tags/count and whatever application you're using to display the data caches that result for some period of time before re-fetching counts. Is this an option? In a web scenario this is the far more common approach.

Nick Craver
We are using Hibernate as the ORM layer and EhCache as our 2nd level cache. This will help from a caching perspective but prioritizing the count (i.e. all entries with the highest count get listed first) will be a problem, unless it is stored in a separate table
Joshua
Unless you're dealing with hundreds of thousands of tags, an in memory sort of the results should be very quick (e.g. just populate/cache a `SortedList<int, string>`, is this an option?)
Nick Craver
Yes, I will go with this approach
Joshua
+1  A: 

I'd say: calculating it at the time of the request is the prefered way, because it is the exact/correct number.

BUT

It is likely that you will hit performance constraints with this approach. In this case using a seperate table is fine.

Depending on the rdbms in use you might have the option of materialized views, which might be able to give you the excact/correct result with acceptable performance.

Jens Schauder
We use h2 in the dev environment and are planning to use either MySQL or Postgres for production. Are materialized views or something equivalent (http://en.wikipedia.org/wiki/Materialized_view) supported in all of the above databases. I would like to avoid any solution which might make it very database specific.
Joshua