This could be terribly trivial, but I'm having trouble finding an answer that executes in less than O(n^2) time. Let's say I have two string arrays and I want to know which strings exist in both arrays. How would I do that efficiently in VB.NET? Is there a way to do this other than a double loop?

+3  A: 

The simple way (assuming no .NET 3.5) is to dump the strings from one array in a hashtable, and then loop through the other array checking against the hashtable. That should be much faster than an n^2 search.
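To illustrate the hashtable idea, here is a minimal sketch in Python rather than VB.NET, using a built-in set as the hash-based lookup. The function name `common_strings` is hypothetical, and skipping duplicates is an assumption about the desired output, not something the answer specifies.

```python
def common_strings(first, second):
    """Return strings present in both sequences, in the order they appear in `second`."""
    seen = set(first)      # one pass over the first array builds the hash lookup
    reported = set()       # assumption: each common string is reported only once
    matches = []
    for s in second:       # one pass over the second array, expected O(1) per lookup
        if s in seen and s not in reported:
            matches.append(s)
            reported.add(s)
    return matches


print(common_strings(["D", "B", "M", "A", "I"], ["I", "A", "P", "N", "D", "G"]))
# prints ['I', 'A', 'D']
```

With arrays of length n and m this does roughly n + m hash operations on average, instead of the n * m comparisons of a double loop.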

Chris Hynes
It would probably be best if they were in a HashSet, Dictionary, or List rather than an array in the first place.
Joel Coehoorn
+1  A: 

Sort both lists. Then you can know with certainty that if the next entry in list A is 'cobble' and the next entry in list B is 'definite', then 'cobble' is not in list B. Simply advance the pointer/counter on whichever list has the lower ranked result and ascend the rankings.

For example:

List 1: D,B,M,A,I
List 2: I,A,P,N,D,G

sorted:

List 1: A,B,D,I,M
List 2: A,D,G,I,N,P

A vs A --> match, store A, advance both
B vs D --> B<D, advance 1
D vs D --> match, store D, advance both
I vs G --> I>G, advance 2
I vs I --> match, store I, advance both
M vs N --> M<N, advance 1
List 1 has no more items, quit.
List of matches is A,D,I

Two list sorts at O(n log(n)) each, plus O(n) comparisons, make this O(n(log(n) + 1)), which simplifies to O(n log(n)).
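The walkthrough above can be sketched in Python as follows. This is only an illustration under the assumption that plain string ordering is acceptable; the function name `sorted_intersection` is made up for the example.

```python
def sorted_intersection(list1, list2):
    """Sort copies of both lists, then walk them in step to collect the matches."""
    a, b = sorted(list1), sorted(list2)   # the two O(n log n) sorts
    i = j = 0
    matches = []
    while i < len(a) and j < len(b):      # the O(n) comparison pass
        if a[i] == b[j]:
            matches.append(a[i])          # match: store it and advance both
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1                        # list 1 has the lower entry: advance 1
        else:
            j += 1                        # list 2 has the lower entry: advance 2
    return matches


print(sorted_intersection(["D", "B", "M", "A", "I"], ["I", "A", "P", "N", "D", "G"]))
# prints ['A', 'D', 'I']
```

Note that if a string occurs several times in both lists, this sketch reports it once per matching pair.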

Phil H
+2  A: 

If one of the arrays is sorted, you can do a binary search on it in the inner loop. This will decrease the time to O(n log n).
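A rough Python sketch of this idea, using the standard `bisect` module for the binary search. The function name `common_via_binary_search` is hypothetical, and the code assumes the second argument is already sorted in ascending order.

```python
from bisect import bisect_left


def common_via_binary_search(unsorted_items, sorted_items):
    """For each string in the unsorted list, binary-search the sorted list for it."""
    matches = []
    for s in unsorted_items:                  # outer loop: n iterations
        pos = bisect_left(sorted_items, s)    # inner lookup: O(log n) instead of O(n)
        if pos < len(sorted_items) and sorted_items[pos] == s:
            matches.append(s)
    return matches


print(common_via_binary_search(["D", "B", "M", "A", "I"], ["A", "D", "G", "I", "N", "P"]))
# prints ['D', 'A', 'I']
```

If neither array is sorted to begin with, sorting one of them first costs another O(n log n), so the overall bound stays the same.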

John Rasch
+2  A: 

If you sort both arrays, you can then walk through them each once to find all the matching strings.

Pseudo-code:

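int index1 = 0, index2 = 0;
while(index1 < list1.Length && index2 < list2.Length)
{
   // Strings can't be compared with < in C#, so compare once and reuse the result.
   // Both arrays must have been sorted with the same (ordinal) comparison.
   int cmp = String.CompareOrdinal(list1[index1], list2[index2]);
   if(cmp == 0)
   {
      // You've found a match
      index1++;
      index2++;
   } else if(cmp < 0) {
      index1++;
   } else {
      index2++;
   }
}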

Then you've reduced it to the time it takes to do the sorting.

Daniel LeCheminant