I feel like a bit of a newbie posting this, but anyway:
I have a large number of stock items, 3000-5000 which have complex names, based on whoever has entered the items over a period of 16 years, and example of a name is:
"Food, Dog, Pal Meaty Bites chunks 8kg bag"
Another, related item is named:
"DOG FOOD: Meaty Bites (Pal) 22kg bag"
The problem is that I have lists of items from a number of suppliers, with updated prices, which I need to match to our existing stock list. The first time I get a list, I want to try and do a "closest match" search and present the user with a list of our current stock item names, which might match the suppliers stock item name. The user will then choose the correct SKU and the app will import the supplier item and link to our Stock table PK.
The name from the supplier will also vary. An example is:
"Pal Meaty Bites Chunks 8kg"
I can do the match in SQL or .NET, which ever you recommend. I want to present the user with as few items as possible based on greatest number of keywords match. My ideas so far are:
In .Net: break into an array and search each keyword for each item (slow) In SQL: Use a full text index and split the name into keywords using "OR" return list on rank with cutoff
This must be a common scenario, I'm just not sure of the best way to do it. Thanks for your input!
Edit: Added some context: We have a SKU table which has about 20 fields, including StockKeepingUnitID, which is the unique PK (int identity). The suppliers products are pulled into a table called StockOrderUnit, which has a FK of SupplierID and StockKeepingUnitID, and has a field called SupplierCode (varchar) which contains the Suppliers unique code for that stock item. The problem is that numerous suppliers send us price lists and it is up to a user to match the supplier items (that are unknown at this point) to the existing SKU's in the DB already. Once they select one, the records are joined.