Hi,
I'm setting up an environment using Nutch 1.0 + Solr 1.4.
In Nutch I configured the subcollection plugin, which seems to work nicely. If I run a normal search with fl=* added, I can see that the subcollection field is filled as intended (something like <str name="subcollection">mysite.com</str>).
My problem is, I would like to be able to sear...
I have a Lucene index with the following documents:
doc1 := { caldari, jita, shield, planet }
doc2 := { gallente, dodixie, armor, planet }
doc3 := { amarr, laser, armor, planet }
doc4 := { minmatar, rens, space }
doc5 := { jove, space, secret, planet }
So these 5 documents use 14 different terms:
[ caldari, jita, shield, planet, gallente...
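The term count stated above can be checked with a minimal sketch in plain Java. It does not use Lucene at all; the class and method names (`DistinctTerms`, `distinctTerms`, `exampleDocs`) are made up for illustration, and the documents are copied from the listing above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DistinctTerms {
    // Collects the distinct terms across all documents, keeping first-seen order.
    static Set<String> distinctTerms(List<List<String>> docs) {
        Set<String> terms = new LinkedHashSet<>();
        for (List<String> doc : docs) {
            terms.addAll(doc);
        }
        return terms;
    }

    // The five example documents from the question, each as a list of terms.
    static List<List<String>> exampleDocs() {
        return List.of(
            List.of("caldari", "jita", "shield", "planet"),
            List.of("gallente", "dodixie", "armor", "planet"),
            List.of("amarr", "laser", "armor", "planet"),
            List.of("minmatar", "rens", "space"),
            List.of("jove", "space", "secret", "planet"));
    }

    public static void main(String[] args) {
        Set<String> terms = distinctTerms(exampleDocs());
        System.out.println(terms.size() + " distinct terms: " + terms);
    }
}
```

Repeated terms such as "planet", "armor" and "space" are counted once, which is why 18 term occurrences collapse to 14 distinct terms.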
Hi,
I have to index a lot of documents that contain reference numbers like "aaa.bbb.ddd-fff". The structure can change, but it's always some arbitrary numbers or characters joined by "/", "-", "_" or some other delimiter.
The users want to be able to search for any of the substrings like "aaa" or "ddd" and also for combinations like "aaa...
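To make the requirement concrete, here is a rough sketch of breaking such a reference number into its searchable parts. It is plain Java, not a Lucene analyzer, and it assumes that every non-alphanumeric character counts as a delimiter; the names `ReferenceTokens` and `parts` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ReferenceTokens {
    // Splits a reference number on runs of non-alphanumeric characters
    // (".", "-", "/", "_", ...) and drops empty pieces.
    static List<String> parts(String reference) {
        List<String> out = new ArrayList<>();
        for (String piece : reference.split("[^\\p{Alnum}]+")) {
            if (!piece.isEmpty()) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parts("aaa.bbb.ddd-fff"));
    }
}
```

Indexing these parts as separate tokens is one way to let a search for "aaa" or "ddd" match; whether that also covers the combined searches depends on how positions are indexed and queried.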
What are the best practices to configure Zend Lucene to make the search results more relevant?
I have the following fields and field types:
productname (Text)
description (Text)
category (Keyword)
Please give some sample code.
...
Hi
I would like to implement a search functionality within my iPhone app which can search for terms within all the documents in the application.
I believe I cannot use Apache Lucene directly since it is in Java. Can I use Lucy which is a C port of Lucene (not sure if Perl and Ruby would work on it)?
Or is there any other open-source s...
I want to get the offset of a term in Lucene. How can I get it?
I stored term vectors for my content with
Field.TermVector.WITH_POSITIONS_OFFSETS
Is there any method in Lucene that gives me the offset of a term in a given Document?
...
Is it possible to modify Lucene 2.2 to add an Arabic analyzer? If anyone has done this already, where can I get the source or jar?
...
Hi,
Why doesn't DuplicateFilter work together with other filters? For example, with a small modification of the DuplicateFilterTest test, it looks as if the filter is not applied on top of the other filters but trims the results first:
public void testKeepsLastFilter()
throws Throwable {
DuplicateFilter df = new DuplicateFi...
Hello everyone,
I've been asked to do an evaluation of Solr as an alternative for a commercial search engine.
The application now has a very particular way of sorting results using something called "buckets".
I'll try to explain in a bit of detail:
In the interface they have 2 fields: "what" and "where".
Both fields are actually ...
Is there a rollback in Lucene?
I'm saving and updating the database repository and the Lucene repository at the same time so that the Lucene index and the database stay in sync.
For example:
CustomerRepository.add(customer);
SupplierRepository.add(supplier);
CustomerLuceneRepository.add(customer);
SupplierLuceneRepository.add(supplier); // If this here fails i...
I am facing the problem of sorting Lucene results based on a user click log. I would like the more frequently accessed results to come first. Does anyone know how to configure or implement this in Lucene or Solr?
Thank you very much.
...
Hey guys, some help here would as always be greatly appreciated.
I'm indexing data from a database using Solr. Each row in the first table, event_titles, can have more than one start date associated with it, stored in the table event_dates. The data-config is as follows:
<entity name="events"
query="select id,title_id,name,summary,descripti...
Hey guys, my requirements are pretty similar to this:
Requirements
http://stackoverflow.com/questions/90580/word-frequency-algorithm-for-natural-language-processing
Using Solr
While the answer to that question is excellent, I was wondering if I could make use of all the time I spent getting to know Solr for my NLP.
I thought of SOL...
Hi All,
I am planning to add a search feature to my web application. I am using the Struts 2 framework for the application, and the items that will be searched are stored in a relational database. To build a full-text search engine, I have the following questions:
For database based search engine should I use just lucene or some oth...
Hi,
We have a requirement where we need to group our records by a particular field and take the sum of a corresponding numeric field,
e.g. select userid, sum(click_count) from user_action group by userid;
We are trying to do this using Apache Solr and found that there are 2 ways of doing this:
Using the field collapsing feature (htt...
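For reference, the result the SQL statement above is expected to produce can be sketched in plain Java. This is only an illustration of the group-and-sum semantics, not a Solr feature; the `UserAction` class, the `sumByUser` method and the sample rows are all made up for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class ClickSums {
    // Hypothetical stand-in for one row of the user_action table.
    static final class UserAction {
        final String userId;
        final int clickCount;

        UserAction(String userId, int clickCount) {
            this.userId = userId;
            this.clickCount = clickCount;
        }
    }

    // Equivalent of: select userid, sum(click_count) from user_action group by userid
    static Map<String, Integer> sumByUser(List<UserAction> actions) {
        return actions.stream().collect(
            Collectors.groupingBy(a -> a.userId, TreeMap::new,
                Collectors.summingInt(a -> a.clickCount)));
    }

    public static void main(String[] args) {
        // Made-up sample rows, only to show the shape of the result.
        List<UserAction> rows = List.of(
            new UserAction("u1", 3),
            new UserAction("u2", 5),
            new UserAction("u1", 4));
        System.out.println(sumByUser(rows));
    }
}
```

Whichever Solr approach is chosen, it should return one row per userid with the summed click_count, as this map does.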
I know I can, during search, specify a "boost factor" to a term as described in http://lucene.apache.org/java/2_4_0/queryparsersyntax.html.
My question is: Can I provide Lucene with a predefined table of relevance?
For instance, I could say that "chair" and "table" are relevant words with a boost factor of 4 and all subsequent searches...
I'm using Lucene.net (2.9.2.2) on a (currently) 70 GB index. I can do a fairly complicated search and get all the document IDs back in 1 to 2 seconds, but actually loading all the hits (about 700 thousand in my test queries) takes 5+ minutes.
We aren't using Lucene for UI; this is a datastore between processes where we have hundreds...
Hello, I'm trying to figure out the right way to read a Lucene index only once while running the application multiple times. How can I do that in Java?
The indexed data will not change, so reading it each time should not be necessary. Can someone explain the logic of reading it only once? Thank you.
UPDATE :
public List ini...
I am using Lucene Highlighter 2.4.1 for my application. I use the highlighter to get the best matching fragments, and display them.
I make a call to a function String[] getFragmentsWithHighlightedTerms(Analyzer analyzer, Query query, String fieldName, String fieldContents, int fragmentsNumber, int fragmentSize). For example:
String te...
I am playing with Lucene for a location search based on city and state, and everything is going pretty well. However, the query parser fails when I pass it "state:OR" and disregards "state:or".
Is there a way to tell the searcher/query parser that I am indeed searching for "OR"?
Thanks.
...