I have a document structure in Solr that looks something like this (irrelevant fields excluded):

<field name="review_id" type="int" indexed="true" stored="true"/>
<field name="product_id" type="int" indexed="true" stored="true"/>
<field name="product_category" type="string" indexed="true" stored="true" multiValued="true"/>

Here, product_id is one-to-many with respect to review_id: each product can have many reviews.

I can get a faceted count of reviews in each category by doing:

/select?q=*:*&rows=0&facet=true&facet.field=product_category
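For reference, the same facet request can be built programmatically. This is only a minimal sketch: the host, port, and core name (`localhost:8983`, `reviews`) and the helper name `build_facet_url` are assumptions for illustration, not part of my actual setup.

```python
from urllib.parse import urlencode

# Hypothetical Solr location; adjust host, port, and core name to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/reviews/select"


def build_facet_url(facet_field, base_url=SOLR_SELECT_URL):
    """Return a select URL that facets on `facet_field` over all documents."""
    params = [
        ("q", "*:*"),                  # match every document
        ("rows", "0"),                 # return no documents, only facet counts
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    # urlencode percent-encodes the values, so "*:*" becomes "%2A%3A%2A".
    return base_url + "?" + urlencode(params)


url = build_facet_url("product_category")
print(url)
```

The counts returned by this request are per document, so with one document per review they are review counts.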

I want to facet on product_category, but get the number of distinct product_id values in each category instead of the number of reviews. Is this possible in Solr?
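To make the desired result concrete, here is a rough sketch of the aggregation done client-side in Python over already-fetched review documents. The sample documents and the function name `distinct_products_per_category` are made up for illustration; this only shows what should be counted, not how Solr would compute it.

```python
from collections import defaultdict


def distinct_products_per_category(reviews):
    """Count distinct product_id values per product_category."""
    products = defaultdict(set)
    for review in reviews:
        # product_category is multiValued, so each review may list several.
        for category in review["product_category"]:
            products[category].add(review["product_id"])
    return {category: len(ids) for category, ids in products.items()}


# Hypothetical sample data: product 10 has two reviews, product 11 has one.
reviews = [
    {"review_id": 1, "product_id": 10, "product_category": ["books"]},
    {"review_id": 2, "product_id": 10, "product_category": ["books"]},
    {"review_id": 3, "product_id": 11, "product_category": ["books", "toys"]},
]

counts = distinct_products_per_category(reviews)
print(counts)
```

With this sample data, a plain facet on product_category would report 3 for "books" (one per review), while the count I am after is 2 (one per distinct product).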

A: 

There is no one-to-many in a Solr index. It's not a relational database. The index is either about reviews or about products, and that depends on what you'll be searching for. To quote the Solr wiki about schema design:

Solr provides one table. Storing a set database tables in an index generally requires denormalizing some of the tables. Attempts to avoid denormalizing usually fail.

So the first step is fixing the schema design. Only after that (and always keeping the fact above in mind) can you design facets and other stuff.

Mauricio Scheffer
I know you can't have one-to-many in the index, but I don't understand why that makes my question wrong. The documents in this case are the reviews (the product data is not in the index), but I still want to group by product_id.
Joakim Lundborg
@Joakim: as with relational databases, if your schema is ill-defined, some queries are more complex than they should be. I really recommend starting with a proper schema design and playing by Solr's rules.
Mauricio Scheffer
@Joakim: nope. Use the power of multiValued fields and dynamic fields.
Mauricio Scheffer
I understand this would be a more complex query (two aggregation steps), but I still wonder whether it can be done in Solr, or whether I have to build two indexes in this case, one for products and one for reviews.
Joakim Lundborg
@Mauricio: could you be a bit more specific on how this would solve this scenario?
Joakim Lundborg
@Joakim: sorry, no, I won't design your schema for you. This is not a trivial requirement, but I believe it can be done in a single index. Multiple indexes complicate things *a lot*.
Mauricio Scheffer