views:

849

answers:

2

SQLite3 appears to come with three different full-text search engines, called FTS1, FTS2, and FTS3. The documentation available on the website mentions that FTS1 is stable, FTS2 is in development, and that you should use FTS2. Examples I find online use FTS3, which is in CVS, and not documented versus FTS2. None of the full-text search engines come with the amalgamated source, as near as I can tell.

So, my question: which of these three engines, if any, should I use for full-text indexing in SQLite? Or should I simply use a third-party tool like Sphinx, or a custom solution in Lucene, instead?

+4  A: 

I've looked into full-text solutions recently too. It seems like SQLite has no de facto choice right now. No matter what you choose, it's inevitable that you'll have to re-architect it as the various FT2, FT3, etc. solutions mature. So bite the bullet and assume you'll need to do more development in the future to keep pace with changing full-text technology.

Sphinx Search has no direct support for SQLite yet. It supports only MySQL and PostgreSQL right now (ca. August 2009). So you'd have to hack your own SQLite connector or else migrate SQLite data to MySQL or PostgreSQL and then index the data with Sphinx Search. I think someone is working on a Sphinx Search patch to support Firebird, so maybe it's not so hard if you're willing to roll up your sleeves.

Also be aware that Sphinx Search has some limitations about incrementally adding data to the index. You should spend an hour or so reading the doc before you decide to use it.

I don't know of any direct way to index SQLite data in Lucene either. You'd probably have to write your own code to process batches of SQLite data, adding rows to the Lucene index one at a time. This seems to be the usage of Lucene no matter what the database.


update: Solr is a great companion technology for Lucene. Solr gives that search engine many features, including the ability to bulk-load query result data from any JDBC data source.

Bill Karwin
I think sqlite3 fts3 is preferable to any external solution. Fewer moving parts. Switching from fts2 to fts3 can't be as hard as integrating Sphinx with its update limitations or other external solutions.
Seun Osewa
@Seun: I agree--somewhat. It's true that Sphinx Search required installing yet another piece of software, whereas fts3 is built in to SQLite. However, fts3 is not built into default builds, so you have to recompile SQLite yourself. And fts3 is a *virtual table* type, you have to do some manual copying of data from the source table to the fts3 table regardless. It's got some moving parts no matter how you do it.
Bill Karwin
+2  A: 

As of 3.6.21, FTS3 is well documented, and gained a more officially visible status.

FTS3 is part of the standard sqlite DLL build on Windows, not sure about the amalgamated source.

We've been using it on production for about a year with no particular issues.

Eric Grange