I have a project where a significant piece of the work will be identifying records that are duplicated in the database (SQL Server 2005). I know the obvious ways to find a duplicate record. However, in this case, we want to be fairly smart about the process. The table(s) will contain information about a prospective customer (lead). The initial tables will accept all leads. We will then run a dupe process that checks whether a lead is a duplicate by matching on several fields. For example, we may want to match on last name, first name, email and zip code. This is just an example, but essentially we want to create a key from various fields to tell whether this person already exists. The records that are not dupes will go into a final table.
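To make the matching idea concrete, here is a rough sketch of the logic in Python. It is not SSIS, only an illustration of the composite key. The field names (`last_name`, `first_name`, `email`, `zip_code`) and both function names are placeholders I made up, and trimming and lowercasing each value is just one assumed way of being "smart" about the comparison.

```python
def make_key(lead, fields=("last_name", "first_name", "email", "zip_code")):
    """Build a match key from the chosen fields (hypothetical column names).

    Values are trimmed and lowercased so that ' Smith' and 'smith' match;
    missing or None values are treated as empty strings.
    """
    return tuple((lead.get(f) or "").strip().lower() for f in fields)


def split_dupes(staged_leads, existing_keys=()):
    """Split staged leads into (unique, dupes) based on the match key.

    existing_keys would hold the keys of leads already in the final table.
    The first lead seen with a given key is kept; later ones are dupes.
    """
    seen = set(existing_keys)
    unique, dupes = [], []
    for lead in staged_leads:
        key = make_key(lead)
        if key in seen:
            dupes.append(lead)
        else:
            seen.add(key)
            unique.append(lead)
    return unique, dupes
```

In this sketch the `unique` list stands in for the rows that would go to the final table, and `dupes` for the rows that would be held back.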
I would like to use SSIS for this, but I am not sure of the best way to accomplish it. Can someone steer me in the right direction or provide a link to an example that uses SSIS to deal with dupes by checking a combination of fields?