Let's say I want to index my shop using Solr/Lucene.
I have several types of entities: Products, Product Reviews, and Articles.
How do I get Lucene to index those types, each with a different schema?
With Lucene/Solr, a document does not need to set a value for every field. Within the same schema, you can define one set of fields for entity A and another set for entity B, and populate only the fields that apply to each entity.
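As a rough illustration of the single-schema approach, the sketch below builds one document per entity type and fills in only the fields that apply to it. The field names (`id`, `type`, `title`, `price`, `product_id`, `rating`, `body`) are made up for the example and are not taken from any real schema. The JSON it prints is the kind of document array that Solr's JSON update handler accepts.

```python
import json

# Hypothetical field names; a shared schema would declare all of them,
# with none of the entity-specific ones marked as required.
product = {"id": "product-1", "type": "product", "title": "Desk lamp", "price": 19.99}
review = {"id": "review-7", "type": "review", "product_id": "product-1",
          "rating": 4, "body": "Bright and sturdy."}
article = {"id": "article-3", "type": "article", "title": "Choosing a lamp",
           "body": "Things to consider..."}

docs = [product, review, article]

# Each document carries only its own fields; the others are simply absent.
payload = json.dumps(docs, indent=2)
print(payload)
```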
With Solr, you also have the option to go multi-core. Each core has its own schema, so you could define one core per entity.
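If you go multi-core, the indexing code mainly has to route each entity to its own core. A minimal sketch, assuming a local Solr at `http://localhost:8983/solr` and three cores named `products`, `reviews` and `articles` (the host and the core names are assumptions for the example):

```python
SOLR_BASE = "http://localhost:8983/solr"  # assumed local install

# Assumed core names, one per entity type; each core has its own schema.
CORE_FOR_TYPE = {
    "product": "products",
    "review": "reviews",
    "article": "articles",
}

def update_url(entity_type: str) -> str:
    """Return the update endpoint of the core that holds this entity type."""
    core = CORE_FOR_TYPE[entity_type]
    return f"{SOLR_BASE}/{core}/update"

print(update_url("review"))
```

The trade-off is that a search across all entity types then has to touch more than one core.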
You might want to have three indexes called Products, ProductReviews and Articles. Each index can have its own schema. One difference between Lucene and a relational DB approach is that a row in a DB roughly translates to a document in Lucene. Note: each document can have its own set of fields, in effect its own schema (which is another difference from a relational DB).
I recommend creating your index so that all of your entities have more or less the same basic fields: title, content, url, uuid, entity_type, entity_sourcename, etc. If each of your entities has its own unique set of index fields, you'll have a hard time constructing a query that searches all entities simultaneously, and your results view may become a huge mess. If you need some specific fields for a specific entity, add them and perform special logic for that entity based on its entity_type.
I'm speaking from experience: we're managing an index with over 10 different entities and this approach works like a charm.
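To make the common-fields idea concrete, here is a small sketch that maps different entities onto the same basic fields. The `to_index_doc` helper, the sample records, and the extra fields `price` and `rating` are hypothetical; only the shared field names come from the recommendation above.

```python
import uuid

def to_index_doc(entity_type, title, content, url, source, **extra):
    """Map any entity onto the shared fields; entity-specific fields go in `extra`."""
    doc = {
        "uuid": str(uuid.uuid4()),
        "entity_type": entity_type,
        "entity_sourcename": source,
        "title": title,
        "content": content,
        "url": url,
    }
    doc.update(extra)  # e.g. price for products, rating for reviews
    return doc

product = to_index_doc("product", "Desk lamp", "A sturdy desk lamp.",
                       "/products/1", "shop_db", price=19.99)
review = to_index_doc("review", "Great lamp", "Bright and sturdy.",
                      "/products/1/reviews/7", "shop_db", rating=4)

# Both documents expose the same basic fields, so one query can cover them.
common = set(product) & set(review)
print(sorted(common))
```

Any special handling (showing a price, rendering stars for a rating) can then branch on `entity_type` at display time.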
P.S. A few other simple pieces of advice.
Multi-core is an approach to use with care. With a simple schema like yours, it is better to do as buru recommends: find the fields common to your different entities, then the fields used by only one or a few of them. You can then add a field "type" or "type_id" that says whether the entity is a product, a product review, etc.
Doing so lets you keep a single index and process queries quickly.
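With a single index and a type field, restricting a search to one kind of entity is just a filter query. The sketch below builds the query string for Solr's standard `/select` handler using the `q` and `fq` parameters; the host, the core name `shop`, and the field names `title` and `type` are assumptions for the example.

```python
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/shop/select"  # assumed host and core name

def search_url(text, entity_type=None):
    """Build a select URL; pass entity_type to restrict results to one kind of entity."""
    # Sketch only: `text` is not escaped for Solr query syntax.
    params = [("q", f"title:{text}")]
    if entity_type is not None:
        params.append(("fq", f"type:{entity_type}"))  # filter on the type field
    return f"{SOLR_SELECT}?{urlencode(params)}"

all_entities = search_url("lamp")
only_products = search_url("lamp", "product")
print(all_entities)
print(only_products)
```

Leaving out `entity_type` searches products, reviews and articles in one request, which is the main benefit of keeping them in the same index.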