In Lucene, using a Standard Analyzer, I want to make fields with spaces and special characters (underscore, !, @, #, ...) searchable.
I set IndexField to NOT_ANALYZED_NO_NORMS and Field.Store.YES.
When I look at my index in Luke, the fields are as I expected, with a value such as:
'SKU Number', yet when I search for 'SKU' or 'SKU*' nothing come...
I am looking at the query syntax, and I could not figure out how to search for 'and'. I tried "a sentence with and and words after it"; I also tried +and and \and. It was always ignored. How can I search for 'and'? I am using Lucene.NET.
...
On a new project I need to make heavy use of Lucene for a search implementation. This search component will be a very important (and big) piece of the project. Is it valid or convenient to replace a relational database + Lucene with MongoDB?
edit: Ok, I will clarify: I'm not asking about risk, I can pay that price in this project. My point is: Is MongoDB ...
I have an object
Title : foo
Summary : foo bar
Body : this is a published story about a foo and a bar
All three are set up as fields with stored=true.
The user searches across my system for the word
"foo"
I would like to highlight foo in all three places.
The user searches for the word foo in the title
"title:foo"
I o...
Good day
The question is: Could anyone give me an example of how to do fuzzy matching of two strings using Lucene.NET (or the Java version of Lucene, or any other language that has a port of Lucene)?
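To make "fuzzy matching of two strings" concrete, here is a minimal Python sketch based on Levenshtein edit distance, which is the idea behind Lucene's fuzzy queries. This is not Lucene's API: the function names `levenshtein` and `similarity` are hypothetical, and normalizing by the longer string's length is an assumption made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map the edit distance to a score in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

Two strings could then be treated as a fuzzy match when `similarity(a, b)` is above a threshold of your choosing; the threshold is something to tune, not a value taken from Lucene.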
...
Hello,
My question in a nutshell: Does anyone know of a TwitterAnalyzer or TwitterTokenizer for Lucene?
More detailed version:
I want to index a number of tweets in Lucene and keep the terms like @user or #hashtag intact. StandardTokenizer does not work because it discards the punctuation (but it does other useful stuff like keeping d...
Hi all, I'm building an application with Lucene (I'm a noob with it) and I'm facing some problems.
My application uses the Lucene 2.4.0 library with a custom Similarity implementation (the jar is imported).
In my app I'm calculating docFreq and numDocs manually (I'm adding the values of all indexes and then I calculate a global value in order to u...
We're running into a serious bug with the Lucene.NET 2.3 codebase. We're upgrading to Lucene 2.9 in hopes the bug is fixed.
Upgrading to the latest version, we see that the MultiFieldQueryParser constructor is [Obsolete]:
[Obsolete("Use the ctor with Version param instead.")]
public MultiFieldQueryParser(string[] fields, Analyzer analyz...
Also, I want to know how to add metadata while indexing so that I can boost some parameters.
...
-- I don't want to start any religious wars, but a quick Google search indicates that Apache Lucene is the preferred open source tool for indexing and searching. Are there others?
-- What file format does Lucene use to store its index file(s)?
Thanks in advance.
Doug
...
I have a text file containing posts in English/Italian. I would like to read the posts into a data matrix so that each row represents a post and each column a word. The cells in the matrix are the counts of how many times each word appears in the post. The dictionary should consist of all the words in the whole file or a non exhaustive E...
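The matrix described here (one row per post, one column per word, cells holding counts) can be sketched in plain Python. This is a minimal illustration under stated assumptions: each post is already a separate string, words are lowercased runs of letters, and the dictionary is every word seen in the input. The names `tokenize` and `term_count_matrix` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters (accented letters included).
TOKEN = re.compile(r"[^\W\d_]+")


def tokenize(post):
    """Lowercase a post and split it into words."""
    return TOKEN.findall(post.lower())


def term_count_matrix(posts):
    """Return (vocabulary, matrix) where matrix[i][j] counts word j in post i."""
    counts = [Counter(tokenize(post)) for post in posts]
    # The dictionary is the sorted set of all words across all posts.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for missing words, so every row has full width.
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix
```

For a restricted dictionary, the same row-building step would work with a fixed word list in place of `vocabulary`.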
I'm trying to do a fuzzy match on the phrase "Grand Prarie" (deliberately misspelled) using Apache Lucene. Part of my issue is that the ~ operator only does fuzzy matches on single-word terms and behaves as a proximity match for phrases.
Is there a way to do a fuzzy match on a phrase with Lucene?
...
Hi,
I'm playing around with Lucene and noticed that the use of a hyphen (e.g. "semi-final") will result in two words ("semi" and "final") in the index. How is this supposed to match if the user searches for "semifinal", in one word?
Edit: I'm just playing around with the StandardTokenizer class actually, maybe that is why? Am I missi...
Hello,
I read some documentation about Lucene; I also read the document at this link
(http://lucene.sourceforge.net/talks/pisa).
I don't really understand how Lucene indexes documents, and I don't understand which algorithms Lucene uses for indexing.
On the above link, it says Lucene uses this algorithm for indexing:
incremental algorithm:
...
I have a bunch of int key fields in my index and am trying to do a simple range search like this:
`gender:1 AND height:[120 TO 180]`
This should give me males in the height range 120 to 180. But for some reason I get this exception:
`At least one range query boundary term must be non-empty term`
How would I debug this? Is it just Zend_...
I have coded up an ASP.NET website that runs on Windows Server 2008 (remotely hosted). The application queries 11 very large Lucene indexes (each ~100GB). I open IndexSearchers on Page_load() and keep them open for the duration of the user session.
My questions:
The queries take ~5 seconds to complete - understandable, as these are very large inde...
I want to know: what is the advantage of Lucene searching and indexing?
Is searching with Lucene as fast as other search algorithms like Quick Search?
What about indexing?
I want to know more about the advantages of Lucene over others.
Thanks.
...
I want to know why Lucene merges indexes.
Or rather, why doesn't Lucene merge all indexes into one index? What is the advantage of this merging method?
...
Given the following query:
(field:value1 OR field:value2 OR field:value3 OR ... OR field:value50)
Can this be broken down into something less verbose? Basically I have hundreds of category IDs, and I need to search for items under large groups of category IDs (20-50 at a time). In MySQL, I'd just use field IN(value1, value2, value3)...
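One less verbose form is the field-grouping syntax of Lucene's classic query parser, `field:(value1 OR value2 ...)`, which can be generated from a list of IDs. Below is a minimal Python sketch that only builds the query string. The helper name `in_query` is hypothetical, the values are assumed to be plain IDs with no characters that need escaping, and whether a given Lucene port's parser accepts field grouping is an assumption to verify.

```python
def in_query(field, values):
    """Build a Lucene-style query string that mimics SQL's field IN (...)."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    # Join the values with OR and group them under a single field prefix.
    terms = " OR ".join(str(v) for v in values)
    return f"{field}:({terms})"
```

For example, `in_query("category", [12, 57, 203])` yields `category:(12 OR 57 OR 203)`. The parser still expands this into one clause per value, so it shortens the query text, not the work done.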
Basically I am doing this
I think I'll set the document ID to the thread ID on my site (even if some types of thread won't be searched). That way I can search by thread ID, but I am clueless about how to delete. I found pages that say to use the document index and that I need to optimize or close before changes take effect, but I don't know how to get the ...