Hey all. I have an MSSQL 2008 database with a fair number of rows. Currently, before a new row is inserted into the table, the stored procedure checks whether that record already exists (by checking a column labeled Title). This check is exact, so if the to-be-inserted record differs even slightly from an existing row, it gets inserted as a new row instead of updating the existing one (which is an approximate match). What I would like to do is detect approximate duplicates in the table before inserting. So a new record that is to be inserted:
The quick brown fox jumps over the lazy dog
would approximately match:
Quick brown fox jumps over the lazy dog
if this record exists in the table already. I've seen (and used for other situations) the Levenshtein distance algorithm implemented in T-SQL, but I'm not sure whether it can be applied in my case, because the algorithm requires a pair of input strings, and here I have the new title on one side and a whole table of titles on the other. How are members of the community handling things of this sort? Thanks.
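
To make the question concrete, here is a rough sketch of the matching logic I have in mind, written in Python rather than T-SQL purely for illustration. It applies the pairwise Levenshtein distance between the new title and each existing title in turn and keeps the closest one under a threshold. The names `find_approximate_match` and `max_distance`, the threshold of 5, and the case-insensitive comparison are all my own assumptions, not anything from the existing stored procedure.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping two rows in memory."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def find_approximate_match(new_title: str,
                           existing_titles: Iterable[str],
                           max_distance: int = 5) -> Optional[str]:
    """Return the closest existing title within max_distance edits, else None.

    Hypothetical helper: the threshold and the lowercasing are assumptions.
    """
    needle = new_title.lower()
    best_title, best_distance = None, max_distance + 1
    for title in existing_titles:
        distance = levenshtein(needle, title.lower())
        if distance < best_distance:
            best_title, best_distance = title, distance
    return best_title


titles = ["Quick brown fox jumps over the lazy dog", "An unrelated title"]
match = find_approximate_match("The quick brown fox jumps over the lazy dog", titles)
```

With the example above, `match` would be the existing "Quick brown fox..." row, so the procedure could update it instead of inserting. My worry is that this compares the new title against every row, which seems expensive on a large table.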