Hi, I have set up a new instance of Solr indexing on a website. I want Solr NOT to index certain URL patterns. Is there any way to specify such an exclude pattern?
Regards, Paras
This can be done in the indexing program: index a document only if its URL does not match the exclude pattern. With SolrJ, this is quite easy to do using Java's regex support.
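A minimal sketch of that client-side check, using only java.util.regex. The class name UrlExcludeFilter, the method shouldIndex, and the example patterns are all made up for illustration; nothing here is part of Solr or SolrJ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: decides whether a URL should be sent to Solr.
public class UrlExcludeFilter {
    private final List<Pattern> excludes = new ArrayList<>();

    // Compile each exclude regex once, up front.
    public UrlExcludeFilter(List<String> regexes) {
        for (String regex : regexes) {
            excludes.add(Pattern.compile(regex));
        }
    }

    // Returns false if any exclude pattern is found anywhere in the URL.
    public boolean shouldIndex(String url) {
        for (Pattern p : excludes) {
            if (p.matcher(url).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Example patterns (assumptions): skip admin pages and PDF files.
        UrlExcludeFilter filter =
            new UrlExcludeFilter(List.of("/admin/", "\\.pdf$"));
        String[] urls = {
            "http://example.com/products/1",
            "http://example.com/admin/login",
            "http://example.com/docs/manual.pdf"
        };
        for (String url : urls) {
            System.out.println(url + " index=" + filter.shouldIndex(url));
        }
    }
}
```

In the indexing program, you would build the document and call the SolrJ add method only when shouldIndex returns true for the document's URL.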
You can also do the filtering inside Solr using an UpdateRequestProcessor. In that UpdateRequestProcessor, you decide whether or not to index each document depending on whether it matches your regex.
Do you have a crawler going out and collecting the data? I would lean towards putting that logic in the crawler. Solr is more of a repository, and I don't think it is the best place to put a lot of indexing logic.
Eric