I have a database of tables that I need to do some comparison work on, and SQL Server is limited in the ways it can compare strings. I put all the data into lists and thought of using String.Compare or String.Contains, but it does not seem to be working correctly. Perhaps someone has a better suggestion on how to do this. It is a large amount of data, and I need to find matches automatically to avoid manually checking each string. Here is some sample data and code:

string 1
adage.com via Drudge Report
Airdrie & Coatbridge Advertiser
Silicon
A NOVO SA

string 2
adage.com
Airdrie and Coatbridge Advertiser
Silicon.com
The A Novo

These are typical examples that should match, but I am not sure how to get this to work.

Rough code implementation:

For i As Integer = 0 To list1.Count - 1
    For j As Integer = 0 To list2.Count - 1
        If list1.Item(i).Contains(list2.Item(j)) Then
            outfile.WriteLine("found match")
        End If
    Next
Next

A: 

If I understand your requirement, you want a match whenever either string is a substring of the other. In that case, don't you need:

If list1.Item(i).Contains(list2.Item(j)) OrElse list2.Item(j).Contains(list1.Item(i)) Then

The above will do a case-sensitive comparison. If you want a case-insensitive comparison, you could do something like this:

If list1.Item(i).ToLower().Contains(list2.Item(j).ToLower()) OrElse _
    list2.Item(j).ToLower().Contains(list1.Item(i).ToLower()) Then
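
For illustration only, here is a minimal, self-contained sketch of the two-way, case-insensitive check described above. The module name MatchSketch and the helper IsLooseMatch are hypothetical names that do not appear in the original code, and the sketch swaps ToLower().Contains(...) for String.IndexOf with StringComparison.OrdinalIgnoreCase, which avoids creating lowered copies of every string. The list contents are a subset of the sample data from the question, with the case of one entry changed to exercise the comparison.

```vbnet
Imports System
Imports System.Collections.Generic

Module MatchSketch
    ' Hypothetical helper: True when either string contains the other,
    ' ignoring letter case. Note that an empty string matches everything.
    Function IsLooseMatch(a As String, b As String) As Boolean
        Return a.IndexOf(b, StringComparison.OrdinalIgnoreCase) >= 0 OrElse _
               b.IndexOf(a, StringComparison.OrdinalIgnoreCase) >= 0
    End Function

    Sub Main()
        ' Sample values taken from the question (case changed on purpose).
        Dim list1 As New List(Of String) From {"adage.com via Drudge Report", "Silicon"}
        Dim list2 As New List(Of String) From {"ADAGE.com", "Silicon.com"}

        ' Compare every pair and report the ones that match.
        For Each s1 As String In list1
            For Each s2 As String In list2
                If IsLooseMatch(s1, s2) Then
                    Console.WriteLine("found match: " & s1 & " <-> " & s2)
                End If
            Next
        Next
    End Sub
End Module
```

A substring test of this kind still will not pair "Airdrie & Coatbridge Advertiser" with "Airdrie and Coatbridge Advertiser", or "A NOVO SA" with "The A Novo", because neither string in those pairs contains the other. Those cases would need some normalization of the text before comparing.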
dcp
Thanks for the response. Do I have to be concerned about case? Will it detect differences in letter case?
vbNewbie
@vbNewBie - See my latest edit.
dcp
A: 

You'd want to use WHERE string1 LIKE '%' + string2 + '%' in your SQL.

Will A