views:

3742

answers:

4

SSIS 2005/2008 does fuzzy lookups and groupings. Is there a feature that does the same in T-SQL?

+2  A: 

SQL Server has a SOUNDEX() function:

SELECT * 
FROM Customers
WHERE SOUNDEX(Lastname) = SOUNDEX('Stonehouse')
AND SOUNDEX(Firstname) = SOUNDEX('Scott')
Ian Boyd
True - forgot about this. But don't think it will work for numeric identifiers, say with transposed digits or something.
ScottStonehouse
I have found soundex to be very primitive in the past, it works but it's granularity is very coarse. Double Metaphone a newer version of phonetic matching, but it is still limiting.
vfilby
+1  A: 

Full Text Search is a great fuzzy tool. Brief primer here

Josef
+2  A: 

Fuzzy lookup uses a q-gram approach, by breaking strings up into tiny sub-strings and indexing them. You can then then search input by breaking it up into equally sized strings. You can inspect the format of their index and write a CLR function to use the same style of index but you might be talking about a fair chunk of work.

It is actually quite interesting how they did it, very simple yet provides very robust matching and is very configurable.

From that I recall of the index when I last looked at it, each q-gram or substring is stored in a row in an table (the index). That row contains an nvarchar column (among other values) that is used as binary data and contains references to the rows that match.

There is also an open feedback suggestion on Microsoft Connect for this feature.

vfilby
+1  A: 

On March 5 2009 I will have an article posted on www.sqlservercentral.com with a sample of Jaro-Winkler TSQL

http://www.sqlservercentral.com/articles/65702/
Brian
Ug. SQLServerCentral.com is almost as bad as experts-exchange. You *can* view the articles, but you have to know to change your user-agent to impersonate Google Bot.
mattmc3