Our application lets users enter the names of companies their organization works with. The problem is that different users enter the same company name in different ways. We need to consolidate this data. Are there any proven approaches for tackling this problem?
A:
The general problem of data quality is usually called data cleansing, and there are many methods and tools for it.
Which one is best for you depends on the extent of your problem and on the technologies you use. But if I understand correctly, the stored data is fine, and the problem is that users search against it with misspelled input? In that case, fuzzy searching could help.
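To make the idea of fuzzy matching concrete, here is a minimal sketch in Python using only the standard library's difflib module. It is an illustration under stated assumptions, not a recommendation of a specific tool: the suffix list, the 0.8 similarity cutoff, and the function names normalize and best_match are all hypothetical choices made for this example and would need tuning against real data.

```python
import difflib
import re

# Assumed list of legal-form suffixes to ignore when comparing names.
# This is illustrative only; extend or replace it for your own data.
SUFFIXES = {"inc", "incorporated", "corp", "corporation",
            "co", "company", "ltd", "limited", "llc"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop common legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    kept = [w for w in words if w not in SUFFIXES]
    # If every word was a suffix, fall back to the unfiltered words.
    return " ".join(kept or words)


def best_match(name, canonical, cutoff=0.8):
    """Return the canonical name most similar to `name`, or None.

    Similarity is difflib's SequenceMatcher ratio on the normalized
    strings; `cutoff` (0.8 here, an arbitrary assumption) is the minimum
    ratio accepted as a match.
    """
    by_key = {normalize(c): c for c in canonical}
    hits = difflib.get_close_matches(
        normalize(name), list(by_key), n=1, cutoff=cutoff
    )
    return by_key[hits[0]] if hits else None
```

With a known list such as ["Microsoft Corporation", "Acme Inc."], best_match("Microsft Corp", known) returns "Microsoft Corporation", while a name with no close counterpart returns None and can be queued for manual review. The same two steps (normalize, then compare by similarity) carry over to other stacks; only the library calls change.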
ewernli
2009-12-22 08:16:16
Fuzzy searching makes sense. Do you have any suggestions for tools?
Rob
2009-12-22 09:08:56
Which technologies are you using?
ewernli
2009-12-22 12:21:04
Pretty much the standard .NET/SQL stack. We currently use Full Text queries, but they don't help much.
Rob
2010-05-06 15:35:15