If you look at the comment here you'll see
Lucene is very much the tool to do
this. If you want apple and apples
(plural) to match, you just need to be
careful about using the correct
language stemmer when indexing and
querying the index.
I'm new to Lucene and barely understand how adding and saving documents works.
How do...
Hi All,
I have implemented Lucene for my application and it works very well until you introduce something like Japanese characters.
The problem is that if I have the Japanese string こんにちは、このバイネイです and I search with こ, which is the first character, then it works well, whereas if I use more than one Japanese character (こんにち) in the search token...
Hi,
I have a Terabyte of data, maybe more, which I'd like to index and search with Lucene. I'd like to be able to split the index out to different machines, similar to what Solr does (if I understand Solr correctly).
Are there any existing tools to do this on the Windows platform?
Thanks!
Edit: I'm not very keen on running Java Luce...
Facing slow search performance using Lucene.Net (+ NHibernate.Search but that doesn't matter).
Luke toolbox overview:
Number of fields: 33
Number of documents: 5607
Number of terms: 101377
Has deletions? / Optimized?: Yes (97478) / No
Index directory is ~200 MB in size.
Query (using org.apache.lucene.analysis.SimpleAnalyzer)...
In the latest version of Lucene (or Lucene.NET), what is the proper way to get the search results back in sorted order?
I have a document like this:
var document = new Lucene.Document();
document.AddField("Text", "foobar");
document.AddField("CreationDate", DateTime.Now.Ticks.ToString()); // store the date as an int
indexWriter.AddDoc...
Lucene.Net -
Is there a way to query for documents that contain a particular field?
Let's say some of my documents have a field 'foo' and some do not.
I want to find all documents that have the field 'foo' - regardless of what the value of foo is.
How do I do this? Is it some sort of TermQuery?
...
Hi, I'm developing a suggest box for my site search service. It has to search fields like these:
Visual Basic Enterprise Edition
Visual C++
Visual J++
My code is:
Directory dir = Lucene.Net.Store.FSDirectory.GetDirectory("Index", false);
IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher( dir,true);
Term term = n...
Let's say I have 2 instances of a class called 'Animal'.
Animal has 3 fields: Name, Age, and Type
The Name field is nullable, so before I insert an instance of Animal as a Lucene indexed document, I check if Animal.Name == null, and if it is, I do not insert it as a field in my document. If I were to retrieve all animals, I would see ...
I am using Lucene.Net 2.0 to index some fields from a database table. One of the fields is a 'Name' field which allows special characters. When I perform a search, it does not find my document that contains a term with special characters.
I index my field as such:
Directory DALDirectory = FSDirectory.GetDirectory(@"C:\Indexes\Name", fa...
I have a Lucene index that has several documents in it. Each document has multiple fields such as:
Id
Project
Name
Description
The Id field will be a unique identifier such as a GUID, Project is a user's ProjectID and a user can only view documents for their project, and Name and Description contain text that can have special characte...
I have a value I am trying to index that looks like this:
Test (Test)
Using a StandardAnalyzer, I attempted to add it to my document using:
Field.Store.YES, Field.Index.TOKENIZED
When I do a search with the value of 'Test (Test)' my QueryParser generates the following tags:
+Name:test +Name:test
This operates as I expect because...
I have this simple Lucene search code (Modified from http://www.lucenetutorial.com/lucene-in-5-minutes.html)
class Program
{
    static void Main(string[] args)
    {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        Directory index = new RAMDirectory();
        IndexWriter w = new IndexWriter...
Is there a pre-existing library to extract plain text from Open XML file formats (e.g. docx, pptx, and xlsx)?
I require this to populate a Lucene.Net index.
I've found this example which extracts text from docx and it seems to work okay. But before building my own solution based on this I was wondering if there's something alread...
Our product consists of multiple applications, all using Lucene. 2 of the applications I am involved with have Lucene indexes of about 3 GB and 12 GB. Another team is building an application for which they estimate the Lucene index size to be close to 1 terabyte. New documents are added to the indexes approximately every 15 days. We do not have...
Hi
I have a large index on which Highlighter.Net works fine, but FastVectorHighlighter returns null as the best fragment on some documents.
The searcher works fine. It is just the highlighter. The field has been indexed in the same manner for all documents, so I fail to understand why it highlights some documents but not all.
Using Lu...
I'm getting started with Lucene.Net (stuck on version 2.3.1). I add sample documents with this:
Dim indexWriter = New IndexWriter(indexDir, New Standard.StandardAnalyzer(), True)
Dim doc = New Document()
doc.Add(New Field("Title", "foo", Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.NO))
doc.Add(New Field("Date",...
In Highlighter.Net, we can use NullFragmenter to return the entire field content. Is there any way we can do this in FastVectorHighlighter.Net?
...
What is the proper usage pattern for LINQ to Lucene's Index<T>?
It implements IDisposable, so I figured wrapping it in a using statement would make the most sense:
IEnumerable<MyDocument> documents = null;
using (Index<MyDocument> index = new Index<MyDocument>(new System.IO.DirectoryInfo(IndexRootPath)))
{
documents = index.Where(d...
"My search returns a highlighted fragment from a field. I want to know where, in that field of the particular searched document, that fragment starts and ends."
For instance, consider that I am searching for "highlighted fragment" in the lines above (treat the above paragraph as a single document).
I am setting my fragmenter as:
SimpleFragm...
I am using the Lucene.NET API directly in my ASP.NET/C# web application. When I search using a wildcard, like "fuc*", the highlighter doesn't highlight anything, but when I search for the whole word, like "fuchsia", it highlights fine. Does Lucene have the ability to highlight using the same logic it used to match with?
Various maybe-...