solr

Is it possible to have SOLR MoreLikeThis use different fields for model and matches?

Let's say I have documents with two fields, A and B. I'd like to use SOLR's MoreLikeThis, but with a twist: I'm most interested in boosting documents whose A field is like my model document's B field. I don't see a way to use the mlt.fl fields or mlt.qf boosts to achieve this effect in a single query. Am I missing some option? Or wi...

Nutch and sitemap.xml

Does Apache Nutch support sitemaps? If not, how can I implement it myself? How can I use the priority field: should it be multiplied into the boost field? ...

Change facet field in Drupal's Apache Solr module

Hey! I have installed the Apache Solr module in Drupal, and the search page returns the facets. What I need to do is change the HTML that the module generates for one of the facets: I want to change it from checkboxes to sliders (using jQuery). I know how to create the sliders. What I don't know is how to change the html generated for the...

How To Search Domain Objects And The Physical Files They Point To Using Solr Or Searchable

I have a digital library system where I store metadata and the path to the physical file in the database. The files may be anything: plain text, Word, PDF, MP3, JPEG, MP4... How can I provide full-text search over both my domain objects and the physical files (or some text extraction of the files)? Is my only choice to store the document text i...

Problem with indexing using StreamingUpdateSolrServer in SOLRJ

I just had a miserable failure with SolrJ. Somehow StreamingUpdateSolrServer failed on some of the items being indexed, while others succeeded. It simply throws an exception with a "Bad Request" message, without any further explanation or stack trace. I suspect that this is due to malformed data, but after double checking, I'm a...

Solr highlighting of multiple terms

I have configured Solr so that the terms I'm searching for are highlighted, but if those terms are far apart, I only see the first one in the highlighting snippet. What I want is something similar to Google's results: snippets separated with an ellipsis (...) so I can see the multiple terms in their context at once. Is it pos...

Best way to keep index real time?

Hi all, I have a Solr/Lucene index of about 700 GB. The documents I need to index arrive in real time: say 1000 docs are submitted every half hour and need to be indexed. In my scenario, an executable runs every 30 minutes and indexes the documents that are not yet indexed, because it is a requirement that the new documents ...

Using Solr CELL's ExtractingRequestHandler to index/extract files from package formats.

Can you use ExtractingRequestHandler and Tika with any of the compressed file formats (zip, tar, gz, etc.) to extract the content for indexing? I am sending Solr the archived.tar file using curl: curl "http://localhost:8983/solr/update/extract?literal.id=doc1&fmap.content=body_texts&commit=true" -H 'Content-type:application/...

Change schema in Solr without reindex

First of all, sorry about my English. In Solr, if we have a field in the schema with stored="true" and we change the analyzer associated with that field, is there any possibility of updating just this field without reindexing all the documents? That is, using the "stored" values of the field with the new analyzer, without any data source or external data. ...

SOLR - How to have facet counts restricted to rows returned in resultset

/select/?q=*:*&rows=100&facet=on&facet.field=category Say I have around a lakh (100,000) documents indexed, but I return only 100 documents using rows=100. The facet counts returned for category, however, are the counts for all documents indexed. Can we somehow restrict the facets to the result set returned, i.e. the 100 rows only? ...
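One way to get per-page counts, whatever the server supports, is to tally the facet field on the client over just the documents that came back. The following is a minimal Python sketch of that workaround, not a Solr feature; the function name and the sample documents are hypothetical, and it assumes the rows are available as a list of dicts (as in the docs array of a JSON response) with category included in the returned fields.

```python
from collections import Counter


def facet_counts_for_page(docs, field):
    """Count values of `field` over only the documents in `docs`."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # document has no value for this field
        # Multi-valued fields arrive as lists; count each value once per document.
        values = value if isinstance(value, list) else [value]
        counts.update(set(values))
    return counts


# Hypothetical page of results, shaped like the docs array of a JSON response.
page = [
    {"id": "1", "category": "books"},
    {"id": "2", "category": ["books", "music"]},
    {"id": "3"},
]
print(facet_counts_for_page(page, "category").most_common())
```

With rows=100 the tally runs over at most 100 small dicts, so the cost on the client is negligible.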

Url Encoding problem in Java and Solr

I am working with Solr and making some filter queries. One of my filter values contains a space, e.g. "fq=listing_type:New home", but this gives an error and no results come back. I also tried "fq=Listing_type:New+home". This did not give an error, but still no results come back, even though there is some XML that has these values. Can anyon...
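A likely factor here is that an unquoted value with a space is split by the query parser, so only "New" is matched against listing_type. A common approach is to wrap the value in double quotes and then URL-encode the whole parameter. Below is a minimal Python sketch of that idea; the helper name build_select_params is hypothetical, and it assumes the field name in the query matches the schema exactly (field names are case-sensitive).

```python
from urllib.parse import urlencode


def build_select_params(field, value, query="*:*"):
    """Return an encoded query string with `value` as one quoted phrase in fq."""
    # Escape backslashes and double quotes so the value stays a single phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": query, "fq": f'{field}:"{escaped}"'}
    # urlencode percent-encodes the quotes and colon and turns the space into '+'.
    return urlencode(params)


print(build_select_params("listing_type", "New home"))
# prints: q=%2A%3A%2A&fq=listing_type%3A%22New+home%22
```

The resulting string can be appended to the /select URL after a "?" as it is.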

Best practice for ensuring Solr/Lucene index is "up to date" after long rebuild

Hi all, we have a general question about best practice/programming during a long index rebuild. This question is not "Solr specific"; it could just as well apply to raw Lucene or any other similar indexing tool/library/black box. The question: what is the best practice for ensuring a Solr/Lucene index is "absolutely up to date" after long i...

Solr statistical information

Is it possible to get some kind of stats from Solr, e.g. the most frequently used words (unigrams) or phrases (bigrams, trigrams)? ...

How to improve search results with QueryElevationComponent?

I'm using Solr 1.4 with the QueryElevationComponent for guaranteed search positions. I have around 700,000 documents and a 1 MB elevation file. It turns out to be quite slow according to the New Relic monitoring website: Slowest Components Count Exclusive Total QueryElevationComponent ...

Haystack more_like_this returns all

I am using Django, Haystack, and Solr to do searching. I am able to search, and now I would like to find similar items using more_like_this. When I try to use the more_like_this functionality, I get back all of the objects of that model type instead of just the ones that closely match it. Here is some code to show you how I am usi...

Create new core directories in SOLR on the fly!

I am using Solr 1.4.1 to build a distributed search engine, but I don't want to use only one index: I want to create new core "index" directories on the fly from my Java code. I found the following REST API for creating new cores using an EXISTING core directory (http://wiki.apache.org/solr/CoreAdmin). http://localhost:8983/solr/admin/...