ansaurus

Question

Matching an approximate string in a Core Data store

Answer 1

+1 A:

You want your search to be diacritic-insensitive, so that the 'é' in "pensée" matches the 'e' in "pensee". You get this by adding the [d] modifier after the comparison operator, like so:

    // contains[cd] matches a substring; like[cd] would need explicit * wildcards around it
    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"(songTitle contains[cd] %@)", yourSongSubstring];

The 'c' in [cd] is for case insensitivity.

Since the words of your search string could appear in any order in the title you are searching, you could tokenize the search string ([... componentsSeparatedByString:@" "]) and then create a predicate like

    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"(songTitle contains[cd] %@) and (songTitle contains[cd] %@)", songToken1, songToken2];

I'm writing this from memory, so the syntax for combining the predicates above may be slightly off.
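For an arbitrary number of tokens, the same idea can be sketched with NSCompoundPredicate instead of a hand-written format string. This is only a minimal sketch: the helper name TitlePredicateForSearchString is hypothetical, the songTitle key is taken from the predicate above, and it uses contains[cd] so that no wildcards are needed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): builds one predicate that
// requires every whitespace-separated word of searchString to appear somewhere
// in songTitle, in any order, ignoring case and diacritics.
static NSPredicate *TitlePredicateForSearchString(NSString *searchString) {
    NSArray *tokens = [searchString componentsSeparatedByCharactersInSet:
                       [NSCharacterSet whitespaceCharacterSet]];
    NSMutableArray *subpredicates = [NSMutableArray array];
    for (NSString *token in tokens) {
        if ([token length] == 0) {
            continue; // skip empty tokens produced by repeated spaces
        }
        [subpredicates addObject:
         [NSPredicate predicateWithFormat:@"songTitle contains[cd] %@", token]];
    }
    // AND the per-token predicates together.
    return [NSCompoundPredicate andPredicateWithSubpredicates:subpredicates];
}
```

The result can be set on an NSFetchRequest like any other predicate. Note that an empty search string yields an AND predicate with no subpredicates, so you may want to handle that case separately.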

baalexander 2009-05-19 21:00:46

Well, I first tried a variation of this, and when I parse real-world data it doesn't quite work. Most of the time, the problem is not in the diacritics or case but in subtle spelling differences (as in "Backstreet girl" vs "Back Street Girl"). This solution also depends heavily on the previous step, tokenization, which is really hard for the domain "words that could appear in a song title".

damdamdam 2009-05-21 05:46:38

Answer 2

A:

I believe the tool you want to use here is SearchKit. I say that as if I've just made your job easy.... I haven't, but it should have the tools you need to be successful here. LNC is still offering their SearchKit Podcast for free (very nice).

Each track would be a document in this case, and you'd need to come up with a good way to index them with an identifier that can be used to find them. You can then load them up with metadata, and search them. Perhaps putting the title "in" the document would be helpful here to facilitate the use of Similarity Searching (kSKSearchOptionFindSimilar). That may or may not work really well.
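A rough sketch of what that could look like with an in-memory index is below. It is not a tested recipe: the helper names, the "track" scheme, the index name, and the cap of 10 results are all assumptions made for illustration, and how well similarity search behaves on text as short as a song title is an open question.

```objc
#import <Foundation/Foundation.h>
#import <CoreServices/CoreServices.h>

// Hypothetical helper: indexes titles in memory, one SearchKit document per track.
// Each document is named by an identifier you can map back to your Core Data object.
static SKIndexRef CreateTitleIndex(NSDictionary *titlesByID) {
    // The backing store is deliberately not released here so that it stays
    // alive as long as the index does in this sketch.
    CFMutableDataRef storage = CFDataCreateMutable(kCFAllocatorDefault, 0);
    SKIndexRef index = SKIndexCreateWithMutableData(storage, CFSTR("titles"),
                                                    kSKIndexInverted, NULL);
    if (index == NULL) {
        return NULL;
    }
    [titlesByID enumerateKeysAndObjectsUsingBlock:^(NSString *trackID, NSString *title, BOOL *stop) {
        SKDocumentRef doc = SKDocumentCreate(CFSTR("track"), NULL,
                                             (__bridge CFStringRef)trackID);
        if (doc == NULL) {
            return;
        }
        // Put the title "in" the document as its text.
        SKIndexAddDocumentWithText(index, doc, (__bridge CFStringRef)title, true);
        CFRelease(doc);
    }];
    SKIndexFlush(index); // commit the added documents so they can be searched
    return index;
}

// Hypothetical helper: returns the identifiers of up to 10 tracks whose
// indexed text SearchKit considers similar to the query.
static NSArray *SimilarTrackIDs(SKIndexRef index, NSString *query) {
    SKSearchRef search = SKSearchCreate(index, (__bridge CFStringRef)query,
                                        kSKSearchOptionFindSimilar);
    NSMutableArray *result = [NSMutableArray array];
    if (search == NULL) {
        return result;
    }
    SKDocumentID ids[10];
    SKDocumentRef docs[10];
    CFIndex found = 0;
    SKSearchFindMatches(search, 10, ids, NULL, 1.0, &found);
    SKIndexCopyDocumentRefsForDocumentIDs(index, found, ids, docs);
    for (CFIndex i = 0; i < found; i++) {
        if (docs[i] == NULL) {
            continue;
        }
        [result addObject:(__bridge NSString *)SKDocumentGetName(docs[i])];
        CFRelease(docs[i]);
    }
    CFRelease(search);
    return result;
}
```

The identifiers returned by SimilarTrackIDs would then be used to fetch the matching objects from the Core Data store.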

The question you've asked is a good one, but there is certainly no industry standard for it because anyone who solves this problem well (i.e. every major search engine) keeps their algorithms very secret. This is a hard problem; no one is quite ready to give away their answer.

Rob Napier 2009-05-19 21:36:13

SearchKit. I had completely forgotten about this API. I looked very hard at the docs and saw immediate uses for it in my app, but I think it's way too involved just to approximate a match between one string and another.

damdamdam 2009-05-21 05:49:34
