views:

582

answers:

2

Assume I have a database table with many names. I'd like to "flex match" against these names. I'm not sure if "flex match" is the proper term to use, but let's go with that for now. There have been similar discussions on "fuzzy matching," but I'm not really interested in phonetic matching. I'm interested in what I'd call ordered-subset-matching.

I would like it to work akin to QuickSilver (OSX) or Launchy (Windows). Here are a few examples of matches for a given search string:

mitMassachusetts Institute of Technology
ffoxFire**fox**
osx ⇒ Mac OS X
msMicro**s**oft Corporation

My end goal is to have a web page with an auto-completing text field that's data driven from the server.

I'm confident I'll get adequate results on the client side by combining features from jQuery LiveUpdate and/or jQuery QuickSelect.

Where I need help is in how to best handle the flex match on the server side against a large table. I have some ideas in how to build my own custom index using the Quicksilver scoring algorithm and maybe some permutation index logic, but I'd rather not re-invent the wheel if something else if readily available.

In summary: What is the best way to gain a fast flex match against a database table with many rows?

+1  A: 

One method would be to just do LIKE matches. Put a % in between each character, and then before and after the string, and search based on that. Obviously, that will pull in other things for ms like 'multimedia systems', but you could probably pair that with another table that contains 'suggested' matches, and sort by those as well.

gms8994
Yeah, I gave that some thought, but it certainly doesn't scale well for large tables. It would require a full table scan. It also doesn't score the results. Sequential characters and initials of words should score higher than random subsets.
Ryan McGeary
+2  A: 

This doesn't answer my question directly, but for the project I'm working on, I realized that I just didn't need a server side component for this yet. To facilitate the client side of my web application, I just launched two new open source projects:

  • LiquidMetal: This is a Quicksilver-like scoring algorithm that scores strings against abbreviations. Useful when building an index.
  • Flexselect: a jQuery plugin that turns select boxes into flex-matching incremental-finding controls. Think of it as Quicksilver squooshed into a select box. It uses LiquidMetal to filter and sort the live results.
Ryan McGeary
+1 for Flexselect.
FreshCode