solr

Solr vs Xapian: which one gave you the more meaningful results?

I am currently using Whoosh to develop a website, and I'll need to choose something more powerful once the website is in production. If any of you have used both of these engines, which one gave you the most meaningful results in the long run? ...

Liferay 5.1.1 Solr plugin ClassCastException

I have Solr 1.2 up and running on port 8983 and am using Liferay 5.1.1. The question is how to configure Solr to search the Liferay JournalArticle table. I've already installed the solr-web plugin for Liferay, but it throws this exception: [SolrIndexSearcherImpl:79] Error while sending request to Solr java.lang.ClassCastException: com.liferay.portal...

Solr - splitting a field value into multiple rows in the index

Is it possible to do so with a transformer (data-config.xml) or in schema.xml somehow? For instance, my database returns rows with two columns:
IntegerColumn, VarcharColumn
1, 'small'
2, 'medium'
3, 'small, medium'
I'd like to add the 3rd row as two separate entries in the Solr index:
1, 'small'
2, 'medium'
3, 'small'
3, 'medium...
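The desired expansion can be sketched outside Solr, as a pre-processing step before the rows are sent for indexing. This is only an illustration of the row splitting itself, not a DataImportHandler transformer; the function name `split_rows` and the sample rows are assumptions made for the example.

```python
def split_rows(rows, sep=","):
    """Yield one (id, value) pair for each separated value in each row."""
    for row_id, raw in rows:
        for part in raw.split(sep):
            value = part.strip()
            if value:  # skip empty fragments such as a trailing separator
                yield row_id, value


# Sample rows mirroring the IntegerColumn / VarcharColumn example.
rows = [(1, "small"), (2, "medium"), (3, "small, medium")]
expanded = list(split_rows(rows))
print(expanded)
```

If IntegerColumn is also the index's uniqueKey, the two entries produced for row 3 would overwrite each other, so each split entry would need its own distinct key.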

How do I create a solr core with the data from an existing one?

Solr 1.4 Enterprise Search Server recommends doing large updates on a copy of the core, and then swapping it in for the main core. I am following these steps:
1. Create prep core: http://localhost:8983/solr/admin/cores?action=CREATE&name=prep&instanceDir=main
2. Perform index update, then commit/optimize on prep core.
3. Swap main and p...

Solr in a multi-tenant environment

I am considering using Solr in a multi-tenant application and I am wondering if there are any best practices or things I should watch out for. One question in particular: would it make sense to have a Solr core per tenant? Are there any issues with having a large number of Solr cores? I am considering using a core per tenant because I cou...

Start Sunspot solr at/on Reboot

Hi guys, I'm trying to start Sunspot's Solr server on my Ubuntu workstation and server. So far I've tried the following in the crontab:
@reboot sh -c 'cd /<location of rails program>/; rake sunspot:solr:start'
and
@reboot sunspot-solr start -d <options> -s <options> -p <port-num>
Neither is able to start the server at reboot. Thoughts? Thanks, ...

How to run solr on a windows server so it starts up automatically?

How can I run Solr on a Windows server so that it starts up automatically? I currently run it with: java -jar start.jar, but I need the server to do this automatically. ...

Solr range query for specific IDs like /solr/select?q=x:[1,2,5,11,64589]

Hi, I have some specific IDs like 1,2,5,11,64589 in Solr (int type). I want to run a query like http://localhost:8983/solr/select?q=x:[1,2,5,11,64589] but it does not work (I get an error). How can I do it? Note: I can implement it with "OR", but I want a simpler way (the other problem is the limit on maximum URL length). ...

Highlighting in Solr 1.4 - requireFieldMatch

I have an object:
Title : foo
Summary : foo bar
Body : this is a published story about a foo and a bar
All three are set up as fields with stored=true. The user searches across my system for the word "foo". I would like to highlight foo in all three places. The user searches for the word foo in the title, "title:foo". I o...

Rails Search Integration

I have been looking at integrating a search engine into Rails, and I am considering either Sphinx with the Thinking Sphinx plugin, or possibly Solr with the acts_as_solr plugin. My question is which one works best in general? As far as I can see, Sphinx seems to be the option most people go with to get a basic search up and running. ...

Lucene Fuzzy Match on Phrase instead of Single Word

I'm trying to do a fuzzy match on the phrase "Grand Prarie" (deliberately misspelled) using Apache Lucene. Part of my issue is that the ~ operator only does fuzzy matches on single-word terms and behaves as a proximity match for phrases. Is there a way to do a fuzzy match on a phrase with Lucene? ...

Solr query results using *

I want to provide for partial matching, so I am tacking on * to the end of search queries. What I've noticed is that a search query of gatorade will return 12 results whereas gatorade* returns 7. So * seems to be 1 or many as opposed to 0 or many ... how can I achieve this? Am I going about partial matching in Solr all wrong? Thanks. ...

Best Practice of Field Collapsing in SOLR 1.4

I need a way to collapse duplicate (defined in terms of a string field with an id) results in Solr. I know that such a feature is coming in the next version (1.5), but I can't wait for that. What would be the best way to remove duplicates using the current stable version 1.4? Given that finding duplicates in my case is really easy (co...

OutOfMemoryError: Java heap space error when starting Solr

Hi, I started indexing DB articles with Solr, but after adding about 58 million articles (about 113 GB on disk), I get the error message below in the Tomcat log.
Note 1: I have already set the initial memory pool to 256 MB and the maximum memory pool to 1400 MB for the Tomcat server.
Note 2: I can post or search articles, but must wait over 3 minutes for a response. ...

ASP.NET Lucene Performance Improvements question

I have coded up an ASP.NET website that runs on Windows Server 2008 (remotely hosted). The application queries 11 very large Lucene indexes (each ~100GB). I open IndexSearchers on Page_load() and keep them open for the duration of the user session. My questions: The queries take ~5 seconds to complete - understandable, as these are very large inde...

How can I search on a list of values using Solr/Lucene?

Given the following query: (field:value1 OR field:value2 OR field:value3 OR ... OR field:value50) Can this be broken down into something less verbose? Basically I have hundreds of category IDs, and I need to search for items under large groups of category IDs (20-50 at a time). In MySQL, I'd just use field IN(value1, value2, value3)...
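One less verbose form is the query parser's field grouping, `field:(value1 OR value2 OR value3)`, which applies a single field name to a parenthesized group of terms. The sketch below only builds that string; the helper name `in_query` and the field name `category_id` are hypothetical, and it assumes the values are plain IDs that need no escaping.

```python
def in_query(field, values):
    """Build a grouped query string of the form field:(v1 OR v2 OR ...)."""
    if not values:
        raise ValueError("values must not be empty")
    # Assumption: values are simple numeric or alphanumeric IDs, so no
    # query-syntax escaping is applied here.
    terms = " OR ".join(str(v) for v in values)
    return f"{field}:({terms})"


query = in_query("category_id", [12, 40, 77])
print(query)  # category_id:(12 OR 40 OR 77)
```

The grouped form is shorter to write and send, but each value still counts as a boolean clause, so the configured maxBooleanClauses limit applies to it just as it does to the expanded form.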

Sunspot / Solr full text search - how to index Rails associations

Is it possible to index through an association with Sunspot? For example, if a Customer has_many Contacts, I want a 'searchable' block on my Customer model that indexes the Contact#first_name and Contact#last_name columns for use in searches on Customer. acts_as_solr has an :include option for this. I've simply been combining the asso...

Which metadata should I save when downloading web pages?

Hi, I'm going to download (for future language-processing purposes) a few thousand web pages. Now I'm wondering which metadata I should save. I have explored this, but I do not want to neglect anything important.
<title>
<link>
<publish_date>
<date_downloaded>
<source> // to this page
<keyword> // for Solr indexing
<text> // cleaned b...

Solr exact word search

Hi all, I want to configure my Solr search engine so that I get an exact match for the search term I enter, e.g. 'taxes' should return documents with 'taxes' and not 'tax', 'taxation', etc. Any help or tips would be appreciated. Kind regards, Ruth ...

Solr JOIN query

I need to run a JOIN query on a Solr index. I've got two XML files that I have indexed, person.xml and subject.xml.
Person:
<doc>
<field name="id">P39126</field>
<field name="family">Smith</field>
<field name="given">John</field>
<field name="subject">S1276</field>
<field name="subject">S1312</field>
</doc>
Subject:
<doc>
<field name="id...