tags:

views:

110

answers:

1

Hi,

I've set up my first 'installation' of Solr, where each index (document) represents a musical work (with properties like number (int), title (string), version (string), composers (string) and keywords (string)). I've set the field 'title' as the default search field.

However, what do I do when I would like to do a query on all fields? I'd like to give users the opportunity to search in all fields, and as far as I've understood there is at least two options for this:

(1) Specify which fields the query should be made against.

(2) Set up the Solr configuration with copyfields, so that values added to each of the fields will be copied to a 'catch-all'-like field which can be used for searching. However, in this case, i am uncertain how things would turn out when i take into consideration that the data types are not all the same for the various fields (the various fields will to a lesser og greater degree go through filters, but as copyfield values are taken from their original fields before the values have been run through their original fields' filters, i would have to apply one single filter to all values on the copyfield. This, again, would result in integers being 'filtered' just as strings would).

Is this a case where i should use copyfields? At first glance, it seems a bit more 'flexible' to rather just search on all fields. However, maybe there's a cost?

All feedback appreciated! Thanks!

A: 

When doing a copy field, the data within the destination field will be indexed using the analyzer defined for that field. So if you define the destination field to be textual data, it is best to only copy textual data in it. So yes, copying an integer in the same field probably does not make sense. But do you really want the user to be able to search for your "number" field in a default search? It makes sense for the title, the composer and the keyword, but maybe not for the integer field that probably represents id in your database.

Another option to query on all fields is to use Dismax. You can specify exactly which fields you want to query, but also defined specific boots for each of them. You can also defined a default sort, add extra boost for more recent documents and many other fancy stuff.

Pascal Dimassimo
Thank you for your answer. I do however wonder if it has any consequences (for instance, performance-wise) to search on several fields instead of searching a single field based on copied field values (copy-field)?
sbrattla
Performance-wise, I do not think there is a significant difference between searching in different field VS searching many times in the same field. Lucene has to do a binary search for each terms of the query. Whether it does n searches on the same field or n searches on different fields should not be different.
Pascal Dimassimo
Thanks for your input!
sbrattla