views:

907

answers:

6

I would like to implement a full text search in an iPhone application. I have data stored in an sqlite database that I access via the Core Data framework. Just using predicates and ORing a bunch of "contains[cd]" phrases for every search word and column does not work well at all.

What have you done that seems to work well?

A: 

I use contains[cd] in my predicate and this works fine. Perhaps you could post your predicate and we could see if there's an obvious fault.

Alex Reynolds
Sorry I meant that it doesn't work well, like performance wise. I'll post the predicate when I get home from work. Thanks!
mswebersd
+1  A: 

I assume that by "does not work well" you mean 'performs badly'. Full-text search is always relatively slow, especially in memory or space constrained environments. You may be able to speed things up by making sure the attributes you're searching against are indexed and using BEGINSWITH[cd] instead of CONTAINS[cd]. My recollection (can't find the cocoa-dev post at this time) is that SQLite will use the index for prefix matching, but falls back to linear search for infix searches.

Barry Wark
Ok that makes sense. I'm going to try that out.
mswebersd
Barry Wark sez "Full-text search is always relatively slow" This is incorrect, perhaps due to misunderstanding what a full text search is? Substring searches that translate into a SQL LIKE statement are slow because every row must be scanned. This is not usually referred to as a "full-text search"A full text search (i.e. making use of a full text index) is often fast, definitely faster than SQL LIKE statements. I'll put more details into an answer below. :)
crunchyt
A: 

Take a look at my answer to this question, there is stuff on improving performance of text/substring searching in Core Data http://stackoverflow.com/questions/1862771/sqlite-indexing-performance-advice/1863315#1863315

lyonanderson
A: 

Sqlite has its own full text indexing module: http://sqlite.org/fts3.html

You have to have full control of the SQL you send to the db (I don't know how Core Data works), but using the full text indexing module is key to speed of execution and simplicity in your SQL SELECT statements that do full text searching.

Using CONTAINS is fine if you don't need fast execution, but selects made with it can't make use of regular indexes so are destined to be slow, and the larger the database the slower it will be. Using real full text indexing allows same sort of searches as you can do with 'CONTAINS', but things are indexed for fast results even with large db's.

Herbert Sitz
A: 

I've been working on this same problem and just got around to following up on my post about this from a few weeks ago. Instead of using CONTAINS, I created a separate entity with an instance for each canonicalized word. I added an index on the words (in XCode model builder) and can then use a BEGINSWITH operator to exploit the index. Nevertheless, as I just posted a few minutes ago, query time is still very slow for even small data sets.

There must be a better way! After all, we see this sort of full text search in lots of apps!

dk
This seems like the way to go. I'll give it a go.
mswebersd
Hey dk, have you had any success in speeding up your search? I did the same thing you did above and it was still pretty slow, as you mentioned.
mswebersd
Weber, I found success by removing the inverse relation, which modified the schema and dramatically sped up searches. But I still haven't tested on a fully loaded DB and am worried about bloat due to new schema. See the comments in the article linked above.
dk
+3  A: 

We have FTS3 working very nicely on 150,000+ records. We are getting subsecond query times returning over 200 results on a single keyword query.

Presently the only way to get Sqlite FTS3 working on the iPhone is to compile your own binary and link it to your project. To my knowledge, the binary included in your own project will not work with Core Data. Perhaps Apple will turn on the FTS3 compiler option in a future release?

You can still link in your own Sqlite FTS3 binary and use it just for full text searches. This would be very similar to the way Sphinx or Lucene are used in Web App environments. Note you will still have to update the search index at some point to keep synchronicity with the Core Data stores.

Good luck !!

crunchyt
Thanks crunchyt, I hope they do enable that option.
mswebersd