Lucene is an excellent search engine, but the .NET version is behind the official Java release (latest stable .NET release is 2.0, but the latest Java Lucene version is 2.4, which has more features).

How do you get around this?

+8  A: 

One way I found, which I was surprised could work: create a .NET DLL from a Java .jar file! Using IKVM, you can download Lucene, get the .jar file, and run:

ikvmc -target:library <path-to-lucene.jar>

which generates a .NET dll like this: lucene-core-2.4.0.dll

You can then just reference this DLL from your project and you're good to go! There are some Java types you will need, so also reference IKVM.OpenJDK.ClassLibrary.dll. Your code might look a bit like this:

        QueryParser parser = new QueryParser("field1", analyzer);
        java.util.Map boosts = new java.util.HashMap();
        boosts.put("field1", new java.lang.Float(1.0));
        boosts.put("field2", new java.lang.Float(10.0));

        MultiFieldQueryParser multiParser = new MultiFieldQueryParser(new string[] { "field1", "field2" }, analyzer, boosts);
        multiParser.setDefaultOperator(QueryParser.Operator.OR);

        Query query = multiParser.parse("ABC");
        Hits hits = isearcher.search(query);
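
The snippet above leaves out where `analyzer` and `isearcher` come from, so here is a slightly fuller sketch. It assumes the project references the IKVM-generated lucene-core-2.4.0.dll and IKVM.OpenJDK.ClassLibrary.dll. Note that the generated assembly keeps the Java package names as namespaces (`org.apache.lucene.*`), not the `Lucene.Net.*` namespaces used by Lucene.NET. The index path `"index"`, the class name `SearchSketch`, and the choice of `StandardAnalyzer` are placeholders for illustration, not part of the original code.

```csharp
// Sketch only: requires lucene-core-2.4.0.dll (built with ikvmc) and
// IKVM.OpenJDK.ClassLibrary.dll to be referenced by the project.
using System;
using org.apache.lucene.analysis;
using org.apache.lucene.analysis.standard;
using org.apache.lucene.queryParser;
using org.apache.lucene.search;

class SearchSketch
{
    static void Main()
    {
        // "index" is a hypothetical path to an existing Lucene index directory.
        IndexSearcher isearcher = new IndexSearcher("index");
        // Assumption: the index was built with StandardAnalyzer.
        Analyzer analyzer = new StandardAnalyzer();

        // Per-field boosts, passed as a plain Java map.
        java.util.Map boosts = new java.util.HashMap();
        boosts.put("field1", new java.lang.Float(1.0f));
        boosts.put("field2", new java.lang.Float(10.0f));

        MultiFieldQueryParser multiParser = new MultiFieldQueryParser(
            new string[] { "field1", "field2" }, analyzer, boosts);
        multiParser.setDefaultOperator(QueryParser.Operator.OR);

        Query query = multiParser.parse("ABC");
        Hits hits = isearcher.search(query);

        // Print the stored value of field1 and the score for each hit.
        for (int i = 0; i < hits.length(); i++)
        {
            Console.WriteLine(hits.doc(i).get("field1") + " (score " + hits.score(i) + ")");
        }

        isearcher.close();
    }
}
```

If the compiler reports that a Lucene type or namespace cannot be found, the `using` directives are the first thing to check, since code written against Lucene.NET will still be importing `Lucene.Net.*`.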

I never knew you could have Java to .NET interoperability so easily. The best part is that C# and Java are "almost" source code compatible (where Lucene examples are concerned). Just replace System.out.println with Console.WriteLine :).

=======

Update: When building libraries like the Lucene highlighter, make sure you reference the core assembly (otherwise you'll get warnings about missing classes). So the highlighter is built like this:

 ikvmc -target:library lucene-highlighter-2.4.0.jar -r:lucene-core-2.4.0.dll
kurious
This is my first time learning of IKVM. Is performance OK? Each instruction in the original Java needs to go through TWO layers of VM, right? The IKVM JVM and then the .NET CLR. And search is one thing you'd like to be as fast as possible.
Corey Trager
Good question. In this case, I believe it actually creates a .NET dll that runs directly and is not interpreted. So, lucene-core-2.4.0.dll is running through the CLR. IKVM may have other modes where it's doing on the fly interpretation which could be slow.
kurious
From quick testing on our dataset, I don't see a performance difference between Lucene.NET and the IKVMC version.
kurious
kurious, how did this work out? Is the performance OK?
Luke101
A: 

Download the source and build it. I did this just last weekend and it was easy. No problem at all. The source is at version 2.3.1.

I'm subscribed to the mailing list and judging from it, Lucene.Net is being developed actively.

Corey Trager
Interesting -- I'd still prefer the latest version (given how easily it can be ported with IKVM) but thanks for the pointer!
kurious
It looks like the latest development version is 2.3, but the latest stable release is 2.0.0.4.
kurious
I'm shipping 2.3 with my app (BugTracker.NET) and so far no complaints.
Corey Trager
+1  A: 

Lucene.Net is under development and now has three committers.

A: 

Hi,

I converted Lucene 2.4 from a .jar to a DLL this way, but now I get the error 'Type or namespace Lucene could not be found'. I removed the old DLL from the project and added a reference to the new one. I really want to get rid of the old version: indexing took around 2 days, it gave an error during optimization at the end, and now the index can't be updated :S. I read somewhere that indexing in Lucene 2.4 is many times faster than in older versions. If I use 2.3.1 from SVN, will that be faster too?