This could be terribly trivial, but I'm having trouble finding an answer that executes in less than n^2 time. Let's say I have two string arrays and I want to know which strings exist in both arrays. How would I do that, efficiently, in VB.NET or is there a way to do this other than a double loop?
The simple way (assuming no .NET 3.5) is to dump the strings from one array in a hashtable, and then loop through the other array checking against the hashtable. That should be much faster than an n^2 search.
Sort both lists. Then you can know with certainty that if the next entry in list A is 'cobble' and the next entry in list B is 'definite', then 'cobble' is not in list B. Simply advance the pointer/counter on whichever list has the lower ranked result and ascend the rankings.
For example:
List 1: D,B,M,A,I
List 2: I,A,P,N,D,G
sorted:
List 1: A,B,D,I,M
List 2: A,D,G,I,N,P
A vs A --> match, store A, advance both
B vs D --> B
D vs D --> match, store D, advance both
I vs G --> I>G, advance 2
I vs I --> match, store I, advance both
M vs N --> M
List 1 has no more items, quit.
List of matches is A,D,I
2 list sorts O(n log(n)), plus O(n) comparisons makes this O(n(log(n) + 1)).
If one of the arrays is sorted you can do a binary search on it in the inner loop, this will decrease the time to O(n log n)
If you sort both arrays, you can then walk through them each once to find all the matching strings.
Pseudo-code:
while(index1 < list1.Length && index2 < list2.Length)
{
if(list1[index1] == list2[index2])
{
// You've found a match
index1++;
index2++;
} else if(list1[index1] < list2[index2]) {
index1++;
} else {
index2++;
}
}
Then you've reduced it to the time it takes to do the sorting.