A friend asked me how to improve some code with LINQ. How would you compare two strings character by character and count the positions at which they match? Here's the original code; could it be improved with LINQ?

private int Fitness(string individual, string target)
{
    int sum = 0;
    for (int i = 0; i < individual.Length; i++)
        if (individual[i] == target[i]) sum++;
    return sum;
}
+4  A: 
return Enumerable.Range(0, individual.Length)
                 .Count(i => individual[i] == target[i]);

A more fool-proof way would be (the above snippet will fail if target is shorter than individual):

return Enumerable.Range(0, Math.Min(individual.Length, target.Length))
                 .Count(i => individual[i] == target[i]);


I believe the code is correct as is. The Enumerable.Range method takes two arguments: the first is the start value (0 here) and the second is the number of items to generate, not the last value. Here is a complete snippet to test it:

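using System;
using System.Linq;

class Program {
  static void Main(string[] args) {
      // "hello" and "world" match only at index 3 ('l'), so this prints 1.
      Console.WriteLine(Fitness("hello", "world"));
  }
  static int Fitness(string individual, string target) {
      // Range(0, n) yields the indices 0 .. n-1, so no index goes past either string.
      return Enumerable.Range(0, Math.Min(individual.Length, target.Length))
                       .Count(i => individual[i] == target[i]);
  }
}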
Mehrdad Afshari
Need to do Length-1 since arrays are zero-based
Jon Galloway
The second argument to `Range` is count, not last item.
Mehrdad Afshari
That's why you need to do Length-1. You will get an "Index out of range exception" when you evaluate individual[individual.Length]. e.g. If the array length is 1 then the only element is foo[0], and foo[1] does not exist.
Kirk Broadhurst
@Kirk: As I said, it's correct, since `Enumerable.Range` expects a count, not the last index. It never tries to evaluate `individual[individual.Length]`. Did you actually test the code, or are you just speculating?
Mehrdad Afshari
When I tested, I got index out of range unless I used Length-1
Jon Galloway
@Jon: can you provide the test cases?
Mehrdad Afshari
For the code golf win: a.ToCharArray().Where((c, i) => i < b.Length
Yuriy Faktorovich
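On the Length-1 question in the comments above, a quick check of which indices `Enumerable.Range(0, individual.Length)` actually produces may help. This is only a minimal sketch; the class name `RangeDemo` and the helper `Indices` are made up for illustration.

```csharp
using System;
using System.Linq;

static class RangeDemo
{
    // Hypothetical helper: the indices that Range generates for a string.
    public static int[] Indices(string s)
    {
        // Range(start, count) yields `count` integers beginning at `start`,
        // so the largest value produced is s.Length - 1.
        return Enumerable.Range(0, s.Length).ToArray();
    }

    static void Main()
    {
        int[] indices = Indices("hello");
        // Prints 0,1,2,3,4
        Console.WriteLine(string.Join(",", indices.Select(i => i.ToString()).ToArray()));
    }
}
```

Since the last index generated is Length - 1, `individual[individual.Length]` is never evaluated. An out-of-range exception is still possible when target is shorter than individual, which is what the Math.Min version guards against.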
+2  A: 

You could write something similar with LINQ, but since "Zip" isn't built in until .NET 4.0 it will be more code than we'd like, and/or not as efficient. I'd be tempted to leave it "as is", but I'd probably check target.Length to avoid an out-of-range exception.

Perhaps I'd make an extension method, though:

public static int CompareFitness(this string individual, string target)
{
   int sum = 0, len = individual.Length < target.Length
        ? individual.Length : target.Length;
   for (int i = 0; i < len; i++)
       if (individual[i] == target[i]) sum++;
   return sum;
}

Then you can use:

string s = "abcd";
int i = s.CompareFitness("adc"); // 2
Marc Gravell
Agreed on the length check. How is it related to Zip, though?
Mehrdad Afshari
Zip takes two sequences and brings them together term-by-term...?
Marc Gravell
Agree, Marc, but I'm accepting Mehrdad's answer because I wanted to see how this would be done in LINQ. Voted for your answer, too. Thanks!
Jon Galloway
I understand, but how can `Zip` reduce code size in this case (relative to my answer)? At least for collections that support direct access by index, it won't make much difference in code size.
Mehrdad Afshari
I don't have the 4.0 version of Zip to hand, but wouldn't it be something like `individual.Zip(target).Count(pair=>pair.X==pair.Y);` - but importantly we haven't had to mess with the Min (i.e. this replaces your **second** version)
Marc Gravell
Yep. That makes sense.
Mehrdad Afshari
A: 

How about a join, or did I misunderstand?

    static void Main(string[] args)
    {
        var s1 = "abcde";
        var s2 = "hycdh";
        var count = s1.Join(s2, c => c, c => c, (a,b) => a).Count();
        Console.WriteLine(count);
        Console.ReadKey();
    }
spender
Now compare "ababa" to "babab" - should be zero, but returns 12 ;-p
Marc Gravell
Erk. I should probably delete to avoid downmark...
spender
Aww. Thought I'd leave it as an example of the diff between zip and join.
spender
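To make the zip/join difference concrete, here is a small sketch that runs both on the two examples from this thread. It assumes .NET 4.0 or later for `Zip`; the class and method names are made up for illustration.

```csharp
using System;
using System.Linq;

static class JoinVsZipDemo
{
    // Join matches on equal keys regardless of position: every 'a' in s1
    // pairs with every 'a' in s2, and likewise for every other character.
    public static int JoinCount(string s1, string s2)
    {
        return s1.Join(s2, c => c, c => c, (a, b) => a).Count();
    }

    // Zip pairs the characters strictly by position.
    public static int ZipCount(string s1, string s2)
    {
        return s1.Zip(s2, (a, b) => a == b).Count(same => same);
    }

    static void Main()
    {
        // 3 x 2 pairs of 'a' plus 2 x 3 pairs of 'b' gives 12 from Join.
        Console.WriteLine(JoinCount("ababa", "babab")); // 12
        Console.WriteLine(ZipCount("ababa", "babab"));  // 0

        // Here the two happen to agree, which hides the problem.
        Console.WriteLine(JoinCount("abcde", "hycdh")); // 2
        Console.WriteLine(ZipCount("abcde", "hycdh"));  // 2
    }
}
```

The join only looks right on "abcde" and "hycdh" because the shared characters occur once each and at the same positions.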