solr

Replicating a read-only solr server

I created a Solr 1.4 index and would like to serve queries against it for a high-volume application. The index that I am querying is static -- no more updates are allowed. A couple of client apps making requests on the server drive the CPU load to about 200% on a quad-core Ubuntu box, so I was thinking of replicating the index on a second...

How to query PDF in Solr?

I added a PDF document to Solr with curl "http://localhost:8983/solr/update/extract?literal.id=doc2&captureAttr=true&defaultField=text&fmap.div=foo_t&capture=div" -F "[email protected]" and I would like to query it for the word "errors": http://localhost:8983/solr/select/?q=errors&version=2.2&start=0&rows=10&indent...

Sync Solr documents with database records

I wonder whether there is a proper way to sync Solr documents with database records. I often run into this problem: Solr contains documents whose database records no longer exist. It seems some DB records have been deleted, but no trigger fired to update Solr. I want to write a rake task to remove documents in Solr that run periodicall...
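The periodic cleanup described here comes down to a set difference between the ids held in Solr and the ids still present in the database. Below is a minimal Python sketch of that step, assuming both id lists have already been fetched. The function names `find_orphaned_ids` and `build_delete_message` and the sample ids are hypothetical, and the asker's own version would be a rake task. The `<delete><id>…</id></delete>` body follows Solr's XML update message format.

```python
from xml.sax.saxutils import escape


def find_orphaned_ids(solr_ids, db_ids):
    """Return ids present in Solr but missing from the database, in Solr order."""
    db_id_set = set(db_ids)
    seen = set()
    orphaned = []
    for doc_id in solr_ids:
        if doc_id not in db_id_set and doc_id not in seen:
            seen.add(doc_id)
            orphaned.append(doc_id)
    return orphaned


def build_delete_message(orphaned_ids):
    """Build an XML delete message to POST to Solr's /update handler."""
    ids = "".join(f"<id>{escape(str(doc_id))}</id>" for doc_id in orphaned_ids)
    return f"<delete>{ids}</delete>"


# Hypothetical sample data: documents 3 and 5 were deleted from the database.
solr_ids = ["1", "2", "3", "4", "5"]
db_ids = ["1", "2", "4"]

orphaned = find_orphaned_ids(solr_ids, db_ids)
message = build_delete_message(orphaned)
print(orphaned)
print(message)
```

The message would still need to be posted to the update handler and followed by a commit; for a large index, fetching ids in pages rather than all at once is probably the safer choice.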

Full text search: Whoosh Vs SOLR

Hi all, I am working on a Django project where I need to implement full-text search. I have looked at Solr and found some good comments about it. But it is implemented in Java and would need a Java environment installed on the system alongside Python. Looking for a Python equivalent of Solr, I have seen Whoosh, but I am not sure ...

Ranking position in Solr

I wonder whether there is a proper way to fulfill this requirement. A book has several keyphrases. Each keyphrase consists of one to three words. The author can either buy a position for a keyphrase or not. Note: each author can buy more than one keyphrase. The keyphrase search must be exact and case sensitive. For example: Book A, ...

Problem using Wildcard search in Solr

Hi, I am having a problem doing wildcard searches in Lucene syntax using the edismax handler. I have a Solr 4.0 nightly build from the trunk. A general search like 'computer' returns results, but 'com*er' doesn't return any results. Similarly, a search like 'co?mput?r' returns no results. The only type of wildcard searches working curren...

ShingleFilter search with more terms than indexed phrase fails

I am using Solr 1.4.1 (Lucene 2.9.3) on Windows and am trying to understand ShingleFilter. I wrote the following code and find that if I provide more words than the actual phrase indexed in the field, then the search on that field fails, i.e. no score is contributed from that field when checked with debugQuery=true. Here is an example I created to repro...

distinct SOLR field values without count

Hi, my question is pretty similar to this question. The difference is that I need the least RAM-intensive way to gather information about the distinct values. I DON'T care about the actual count in this case; I just want to know the possible values for that field. I'm constantly running out of heap space (30 million+ documents) and there has to ...

How to get the images in Nutch results?

Hi, how can I get images in Nutch results? Can you please explain whether this is possible with images? Or is there any other open search engine that produces results with images? Thanks, Murali ...

Solr - omit certain fields from being highlighted

Hi all, I have a Solr engine deployed with a standard request handler: <requestHandler name="standard" class="solr.SearchHandler" default="true"> <!-- default values for query parameters --> <lst name="defaults"> <str name="echoParams">explicit</str> <str name="facet">true</str> <str name="facet.field">path</str> <str name...

SOLR - how to do a fuzzy search on booleans

If my index contains three boolean fields, a, b and c, I would like to search for "a=True, b=False, c=True", and Solr should return all entries with a score that represents how well each one matches the whole query, e.g. a=T, b=F, c=T, score=1.0; a=T, b=T, c=T, score=0.6; a=T, b=T, c=F, score=0.5. Is that possible? ...
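One way to approximate this, assuming the default Lucene-style scoring in which a document that matches more optional clauses generally ranks higher, is to OR the three field conditions together instead of requiring all of them. The Python sketch below only builds the query string and shows a stand-in ranking by fraction of matching fields. The helper names `build_soft_boolean_query` and `match_fraction` are hypothetical, the stand-in is not Solr's actual scoring formula, and the resulting scores will not equal the illustrative values in the question. Entries that match none of the three conditions would not be returned by such a query.

```python
def build_soft_boolean_query(wanted):
    """OR together one optional clause per boolean field, e.g. 'a:true OR b:false'."""
    clauses = [f"{field}:{str(value).lower()}" for field, value in wanted.items()]
    return " OR ".join(clauses)


def match_fraction(doc, wanted):
    """Stand-in score: fraction of wanted fields whose value matches the document."""
    matches = sum(1 for field, value in wanted.items() if doc.get(field) == value)
    return matches / len(wanted)


wanted = {"a": True, "b": False, "c": True}
print(build_soft_boolean_query(wanted))

# Hypothetical entries, ranked locally by the stand-in score.
docs = [
    {"a": True, "b": True, "c": False},
    {"a": True, "b": False, "c": True},
    {"a": True, "b": True, "c": True},
]
ranked = sorted(docs, key=lambda d: match_fraction(d, wanted), reverse=True)
for doc in ranked:
    print(doc, round(match_fraction(doc, wanted), 2))
```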

Django Haystack QuerySet Returns Similar Values Back

Hello, when I do a Haystack operation with a Solr backend, SearchQuerySet.filter(categories='sean'), I get results back for items that are indexed with category values of both 'Sean' and 'Sean McCully', but not for anything with a value of, say, 'Jason'. Using exact does not alleviate this issue. I am using version 1.1 of Haystack and can ve...

Solr indexing problem

Hello, I am new to Solr. When I index the files, every variable gets indexed, but some are not searchable. How can I stop Solr from displaying any results in that case? ...

Boost result with sentence matching in Solr

Hi, I have a basic Solr installation which stores articles (title, description, date). If I search for golf club and sort by date, I get every article with golf or club in the title or description. If I sort by score, I get the ones with golf club first. Is there a way to boost those with golf club and then get those with either g...

Using AND, OR and NOT in Solr Query

Hi, I am trying a Solr query like this: +field1:* AND (field2:1 OR field2:10) NOT(field3:value1 OR field3:value2). But the field3 part of the query has no effect. It still returns records which have value1 or value2 in field3. Why is this? ...
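One rewriting often suggested for queries of this shape, assuming the standard Lucene query parser, is to replace the mixed AND/OR/NOT operators with explicit required (`+`) and prohibited (`-`) clauses. Whether this resolves the behaviour reported above is not confirmed by the excerpt. The Python sketch below only assembles the query string and the URL parameter; the helper name `build_query` is hypothetical.

```python
from urllib.parse import urlencode


def build_query(required, any_of, excluded):
    """Combine required (+), one-of (+(... OR ...)) and prohibited (-) clauses."""
    parts = [f"+{clause}" for clause in required]
    if any_of:
        parts.append("+(" + " OR ".join(any_of) + ")")
    parts.extend(f"-{clause}" for clause in excluded)
    return " ".join(parts)


query = build_query(
    required=["field1:*"],
    any_of=["field2:1", "field2:10"],
    excluded=["field3:value1", "field3:value2"],
)
print(query)

# URL-encode the query before appending it to a /select request.
print(urlencode({"q": query}))
```

Encoding the query matters here because a literal `+` in a URL would otherwise be read as a space.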

Solr's SnowballPorterFilterFactory and Wildcard parameters

Hi, I'm having an issue querying Solr using the following field type: <fieldType name="text_ci" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" wo...

Solr schema for multi-value fields

Hi, in my document I have data as Party value and Party Type. A single document can have multiple Party values and Party Types. For example, the first document will have a Party value of Pramod and a Party type of client. The same document will have a Party value of XYZ and a Party type of supplier. I need to design the schema in such a way that i am a...

How to filter custom content type nodes using ajax in Drupal?

Hello. I'm in a situation where I think I need to create my own custom search module. What I'm trying to do is make a page with a list of all my nodes in the node type - let's call it 'Beer'. So I want to be able to filter through the beers in a fashion similar to the one you find on the Apple Trailers page ( http://trailers.apple.com/ )...

Why are document stores like Lucene / Solr not included in NoSQL conversations?

All of us have come across the recent hype around NoSQL solutions. MongoDB, CouchDB, BigTable, Cassandra, and others have been listed as NoSQL options. Here's an example: http://architects.dzone.com/articles/what-nosql-store-should-i-use However, three years ago a co-worker and I were using Lucene.NET as what seems to fit the descr...

I need to implement a Solr schema with a frequently updated field

I'm using Lucene/Solr for searching and navigation in a file upload application. I need to update the indexed value 'downloaded' for each document on each download. The same situation occurs on digg.com, which shows how many "diggs" each link has while you are searching. Do I have to delete and re-insert the document for each download, or is there something...