I'm trying to decide on an open source search/indexing technology for a .Net project. It seems like the standard out there for Java projects is Lucene, but as far as .Net is concerned, the Lucene.Net project seems to be pretty inactive. Is this still the best option out there? Or are there other viable alternatives?

+2  A: 

Have a look at www.searcharoo.net. It has a crawler and features like word stemming and indexing of Office documents and PDFs. The author is very active on the CodeProject articles and responds to questions pretty quickly.

russau
+4  A: 

Lucene.Net will necessarily lag behind the Java version, since it is a port. I also don't like that the port is a straight copy of the Java code, although I suppose that does make it easier to reuse the docs. Something to consider is using Solr if you don't need super tight (binary) integration. I have used it before with good success. It is still powered by Lucene, but I think it is the better choice because it adds features on top. You can use it from .NET via an HTTP endpoint.

One question to ask yourself is what you really need/want in a search solution. There are a lot of ways to go about implementing search and not all solutions work for every situation.

Steve
+2  A: 

Although it's not .NET, I would recommend using Solr: it's built on Lucene and will be simple to integrate, given that it is queried over HTTP and returns XML or JSON.
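To give a rough idea of what "integrating over HTTP" looks like, here is a minimal sketch in Python (the same pattern applies from .NET with any HTTP client). The base URL, the core name `docs`, and the helper names are assumptions made for illustration; the `select` handler with the `q`, `rows` and `wt=json` parameters and the `response.docs` layout follow Solr's standard query interface, but check them against the version you deploy.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_solr_query_url(base_url, core, text, rows=10):
    """Build a URL for Solr's select handler (core name is an assumption)."""
    params = urlencode({"q": text, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


def extract_docs(response_body):
    """Pull the list of matching documents out of a Solr JSON response."""
    return json.loads(response_body)["response"]["docs"]


def search_solr(base_url, core, text, rows=10):
    """Run the query against a live Solr instance and return the documents."""
    with urlopen(build_solr_query_url(base_url, core, text, rows)) as resp:
        return extract_docs(resp.read().decode("utf-8"))
```

With a Solr instance running locally, something like `search_solr("http://localhost:8983/solr", "docs", "title:lucene")` would return the matching documents as plain dictionaries, so the calling application never links against Lucene itself.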

ADAM
+5  A: 

SQLite has FTS3 (Full Text Search 3) that may do what you want it to do. I don't have direct experience with it, but I believe it was developed explicitly to do what Lucene does, at least in the simple case. I don't believe you can alter the tokenizer or anything (without modifying source code, anyway), but it's an option.
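For the simple case, an FTS3 index is just a virtual table queried with `MATCH`. The sketch below uses Python's built-in `sqlite3` module and assumes the underlying SQLite library was compiled with FTS3 enabled (most builds are, but not all); the table name, columns and sample rows are made up for illustration.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS3 table and load (title, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    # Requires an SQLite build with FTS3 support (assumption).
    conn.execute("CREATE VIRTUAL TABLE docs USING fts3(title, body)")
    conn.executemany("INSERT INTO docs (title, body) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def search_index(conn, query):
    """Return the titles of rows whose indexed text matches the query."""
    cur = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY docid", (query,)
    )
    return [row[0] for row in cur]


conn = build_index([
    ("Lucene.Net", "A port of the Java Lucene library"),
    ("Sphinx", "A standalone full-text search server"),
])
print(search_index(conn, "lucene"))
```

The default tokenizer lowercases ASCII text and splits on non-alphanumeric characters, which is why a lowercase query still finds "Lucene.Net".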

Mark
We use SQLite FTS in our product and it is very good and much faster than Lucene.NET for our specific cases.
Filip Navara
+9  A: 

I know this isn't open-source, but it is a free and very comprehensive offering from Microsoft:

Microsoft Search Server 2008 Express

  • Out-of-the-box relevancy.
  • Localized interface.
  • Extensible search experience.
  • No preset document limits.
  • Continuous propagation indexing.
  • Out-of-the-box indexing connectors.
  • Content summaries.
  • Hit highlighting.
  • Best bets and definitions.
  • Query correction.
  • Duplicate collapsing.
  • Filter by property.
  • Filter by language.
  • Sort by date.
  • E-mail/RSS alerts.

Dan Diplo
However, the DB size limit is easily reached if you're going to use this for a search index. It's also not primarily designed for text indexing, and while text indexing may work, it will perform rather poorly compared to something like Lucene.
Eamon Nerbonne
Interesting. I didn't know MS did a product like this.
RichardOD
+3  A: 

As I understand, you need "just" a full-text index on your existing database, and SQL Server full-text search in principle worked for you, but your current implementation/setup is too slow.

If I were you, I wouldn't go for a completely different approach (just think about the mess of keeping an external index in sync with your database, or joining query results from both, etc.). Try to fix the performance issue with SQL Server, as nobody would seriously assume that 6 seconds for searching 7k rows is the final word for an enterprise-class solution that is used for some of the largest databases around... Maybe try asking a new question about common pitfalls with this feature (I'm not an expert on this), and you might end up with a simple fix instead of a complete rebuild of your search architecture ;)

markus
+18  A: 

While there have been no 'full-blown' releases of Lucene.Net (i.e. with full documentation and web site updates) for quite some time, there are still fresh commits to its SVN repository. The latest release (2.3.2), for example, was tagged on 07/24/09 (see here). Since development is still active, I would use it for new full-text search projects.

maayank
I kind of figured this was going to be the answer. Lucene.Net it is then. Thanks everybody!
jamesaharvey
A: 

If you don't really insist on .Net you can give Sphinx a try. Open source and available for all platforms (Windows / Linux).

birger
+3  A: 

Lucene.Net is integrated with NHibernate, so if you are also looking for an O/R mapper, the combination may be worth a closer look.

We are currently developing a prototype, and configuring Lucene took only a few minutes (we use Fluent NHibernate).

griti
I am giving NHibernate a try as well. Thanks for the info.
jamesaharvey
+3  A: 

After having used Lucene.Net in a couple of projects, I'd also like to suggest compiling the Java version of Lucene into .NET code with IKVM.NET. It works wonderfully, and you never have to worry about being out of date with respect to the Java version. You also have the option of compiling all the extra libraries and using them as well (I'm using the GIS search stuff in one project).

Mark
Have you thought of creating a CodePlex project for this? Maybe set up a periodic build.
Mikos
+1 for this obvious but easily overlooked option. With Lucene.NET around, I hadn't thought of it myself. Did you encounter any obstacles that could make this difficult for non-Java shops, or is using IKVM for a project that size as smooth as it sounds?
Steffen Opel
@Mikos - pretty nifty idea. If this turns out to be feasible with a project the size of Lucene, it could be a nice precedent for this approach. Or is this approach commonplace already and I've just been missing out?
Steffen Opel
+1 I didn't even know this was available. Thanks.
Chris Lively
@Steffen - it really is that easy. The only problem I've run into is that the new version of Lucene (3.0.2) uses a class that isn't available in the current version of IKVM's JVM. I ended up using SimpleFSLockFactory instead of NativeFSLockFactory.
Mark