views:

96

answers:

2

Hi,

I m developping a simple dictionnary word app in french with 150k words and definitions. i m looking for the best way to do this.

First i use a sqlite bdd with 150k words. i use the LIKE command for word search but it is very slow ex : SELECT * FROM words WHERE word LIKE '%avoi%' LIMIT 0,50; for searching word who contain 'avoi' like 'avoir' or 'savoir'. my table have the word column indexes but LIKE doesn't use index so it is very slow (2-5)s on 3GS.

After i use fts3 extension off sqlite for use MATCH command ex : SELECT * FROM words WHERE word MATCH 'avoi*' LIMIT 0,50; much better (0,1-0,15s) on 3GS but it s only search for word that begin with 'avoi' word like 'savoir' is not in the result. MATCH command doesn't work with syntax like 'avoi'

Have you any ideas for optimize this text search ?

I have a very good exemple of iphone app : Dixel (Robert Disctionnary) who make this kind of search very fast. Any ideas for the method ?

thanks for answers.

+2  A: 

Fast dictionaries use complex data structures to limit brute force searching. There is a lot of data about words that can be stored and searched quickly

One such data structure is simply an ordering of words based on the relationships between letters they contain. E.g. you have a table that list all words in which a is followed by a v. Then another for all words that have a v followed by an o. Searching for an arbitrary string avothen becomes a matter merging the tables with sequenced AND. So:

(all words in which `a` is followed by a `v`) AND (all words in which `v` followed by an `o`)

Once you get the table of all words that match have the necessary pattern, you can brute force it quickly.

Dictionaries are like Dates and Times, they seem simple because we are used to them but behind the scenes code needed to make them work on computers is deceptively complex.

TechZen
thanks for the answer... I have no doubts about the difficulty of modeling a bdd for this kind of search.I'm trying to put the indexes words in NSArray... search become more fast (about 0,5s) but i don't think it's a good way to do that (3 Mo Memory for the NSArray...).Did you have some good link or good read which can help me ? i'm a little loss and i don't know where and who to search...Thanks
rubijn
I don't have any links. This is kind of specialized stuff and general data structures is a big topic. I would suggest looking at some of the open source dictionaries floating around. Look at how they handle the issue. I can tell you that your going to have to do a lot of processing on the database to index it six ways to sunday.
TechZen
A: 

The fastest way to search lots of data is a database like SQLite3. The reason databases are fast is their organization of tables. I'm guessing that when you tried the SQLite database, you had one table with every word in it. To get a faster speed, you would need to have a table with every length of word, then within each have 26 tables with the first letter of every word, or any other way they could be organized.

NSArray
If i use FTS3 my search is very speed 0.15s max. So i don't have to cut the BDD. But my real problem is to find part of word....let say i'm looking for 'book' i want to retreive 'booking' but i want 'rebooking' to... So your method doesn't work. If i have one table by letter and / or word length i could never search for word endind by 'ook' or who contain 'ook'.
rubijn