I am trying to implement faceted search, or tagging with multiple-tag filtering. In the faceted navigation, only non-empty categories are displayed, and next to each category the number of items in it that also match the already applied criteria is shown in parentheses.
I can get all items that have the selected categories assigned using INNER JOINs, and get the number of items in each category using COUNT and GROUP BY. However, I'm not sure how this will scale to millions of objects and thousands of tags, especially the counting.
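To make the approach concrete, here is a minimal sketch of the counting query using Python's sqlite3 module. The table and column names, the `facet_counts` helper and the sample rows are assumptions made for illustration only, and this is the straightforward JOIN / GROUP BY version, not an optimized one.

```python
import sqlite3

# Hypothetical minimal schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE items_tags (
        item_id INTEGER REFERENCES items(id),
        tag_id  INTEGER REFERENCES tags(id),
        PRIMARY KEY (item_id, tag_id)
    );
    INSERT INTO items VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO tags  VALUES (1, 'red'), (2, 'blue'), (3, 'large'), (4, 'unused');
    INSERT INTO items_tags VALUES (1, 1), (1, 3), (2, 1), (3, 2), (3, 3);
""")


def facet_counts(conn, selected_tag_ids):
    """Return {tag_id: item_count} over the items that carry every selected tag."""
    selected = sorted(set(selected_tag_ids))
    if selected:
        placeholders = ",".join("?" * len(selected))
        # Items that have ALL selected tags: match any of them, then require
        # that the number of distinct matched tags equals the number selected.
        matching = f"""
            SELECT item_id FROM items_tags
            WHERE tag_id IN ({placeholders})
            GROUP BY item_id
            HAVING COUNT(DISTINCT tag_id) = ?
        """
        params = selected + [len(selected)]
    else:
        # No filter applied yet: every item matches.
        matching = "SELECT id AS item_id FROM items"
        params = []
    # The inner join drops tags with no matching items, so only
    # non-empty categories come back.
    sql = f"""
        SELECT it.tag_id, COUNT(*) AS n
        FROM items_tags AS it
        JOIN ({matching}) AS m ON m.item_id = it.item_id
        GROUP BY it.tag_id
    """
    return dict(conn.execute(sql, params).fetchall())


print(facet_counts(conn, []))      # counts with no filter applied
print(facet_counts(conn, [1]))     # counts among items tagged 'red'
print(facet_counts(conn, [1, 3]))  # counts among items tagged 'red' and 'large'
```

Every call rescans the join table for the matching items, which is exactly the part I'm worried about at scale.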
I know that there are some non-relational solutions like Lucene + SOLR, but I've also found some closed-source RDBMS-based implementations that are said to be enterprise-strength, like FacetMap.com or Endeca software, so there must be an efficient way to perform faceted search in relational databases.
Does anybody have experience with faceted search and could give some tips?
Should I cache the counts for each set of categories? Or maybe use some smart incremental technique that updates the counters as items are tagged and untagged?
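As a rough illustration of the incremental idea, the sketch below keeps a per-tag counter up to date with triggers, again in SQLite via Python. The `tag_counts` table and the trigger names are hypothetical, and this only covers the unfiltered counts (no criteria applied yet); counts under an active filter depend on the selected combination and would still need a query or a cache keyed by that combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items_tags (
        item_id INTEGER NOT NULL,
        tag_id  INTEGER NOT NULL,
        PRIMARY KEY (item_id, tag_id)
    );

    -- Hypothetical cache table: one row per non-empty tag with its item count.
    CREATE TABLE tag_counts (
        tag_id INTEGER PRIMARY KEY,
        n      INTEGER NOT NULL
    );

    -- Tagging an item: create the counter row if needed, then increment it.
    CREATE TRIGGER items_tags_ai AFTER INSERT ON items_tags
    BEGIN
        INSERT OR IGNORE INTO tag_counts (tag_id, n) VALUES (NEW.tag_id, 0);
        UPDATE tag_counts SET n = n + 1 WHERE tag_id = NEW.tag_id;
    END;

    -- Untagging an item: decrement, and drop the row once the tag is empty.
    CREATE TRIGGER items_tags_ad AFTER DELETE ON items_tags
    BEGIN
        UPDATE tag_counts SET n = n - 1 WHERE tag_id = OLD.tag_id;
        DELETE FROM tag_counts WHERE tag_id = OLD.tag_id AND n = 0;
    END;
""")

conn.executemany(
    "INSERT INTO items_tags VALUES (?, ?)",
    [(1, 10), (2, 10), (2, 20)],
)
conn.execute("DELETE FROM items_tags WHERE item_id = 2 AND tag_id = 20")

# Reading the counters is now a primary-key lookup instead of a GROUP BY.
counts = dict(conn.execute("SELECT tag_id, n FROM tag_counts"))
print(counts)
```

The sketch does not handle UPDATEs that change `tag_id` in place; a real version would need a trigger for that case as well.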
Edit:
An example of faceted navigation can be found here: Flamenco.
Currently I have the standard 3-table schema (items, tags and items_tags, as described here: http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html#toxi ) plus a table for facets. Each tag is assigned to a facet.
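Roughly, the schema looks like the sketch below (SQLite via Python). The column names, the `facet_id` foreign key, the extra index and the sample rows are my assumptions for illustration; the query at the end lists the non-empty tags grouped by facet, with no filter applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE facets (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE items  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tags (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        facet_id INTEGER NOT NULL REFERENCES facets(id)  -- each tag belongs to one facet
    );
    CREATE TABLE items_tags (
        item_id INTEGER NOT NULL REFERENCES items(id),
        tag_id  INTEGER NOT NULL REFERENCES tags(id),
        PRIMARY KEY (item_id, tag_id)
    );
    -- Assumed extra index so lookups by tag do not scan the whole join table.
    CREATE INDEX items_tags_by_tag ON items_tags (tag_id, item_id);

    INSERT INTO facets VALUES (1, 'color'), (2, 'size');
    INSERT INTO items  VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO tags   VALUES (1, 'red', 1), (2, 'blue', 1),
                              (3, 'large', 2), (4, 'small', 2);
    INSERT INTO items_tags VALUES (1, 1), (1, 3), (2, 1), (3, 2), (3, 3);
""")

# Non-empty tags per facet with their item counts; 'small' has no items
# and is dropped by the inner join.
rows = conn.execute("""
    SELECT f.name, t.name, COUNT(*) AS n
    FROM facets AS f
    JOIN tags AS t        ON t.facet_id = f.id
    JOIN items_tags AS it ON it.tag_id = t.id
    GROUP BY f.id, f.name, t.id, t.name
    ORDER BY f.name, t.name
""").fetchall()

for facet, tag, n in rows:
    print(f"{facet}: {tag} ({n})")
```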