tags:

views:

1156

answers:

3

I have an interesting problem, and I can't seem to figure out the lambda expression to make this work.

I have the following code:

List<string[]> list = GetSomeData(); // Returns large number of string[]'s
List<string[]> list2 = GetSomeData2(); // similar data, but smaller subset
&nbsp;
List<string[]> newList = list.FindAll(predicate(string[] line){ 
    return (???);
});

I want to return only those records in list in which element 0 of each string[] is equal to one of the element 0's in list2.

list contains data like this:

"000", "Data", "more data", "etc..."

list2 contains data like this:

"000", "different data", "even more different data"

Fundamentally, i could write this code like this:

List<string[]> newList = new List<string[]>();
foreach(var e in list)
{
    foreach(var e2 in list2)
    {
        if (e[0] == e2[0])
            newList.Add(e);
    }
}
return newList;

But, i'm trying to use generics and lambda's more, so i'm looking for a nice clean solution. This one is frustrating me though.. maybe a Find inside of a Find?

EDIT: Marc's answer below lead me to experiment with a varation that looks like this:

var z = list.Where(x => list2.Select(y => y[0]).Contains(x[0])).ToList();

I'm not sure how efficent this is, but it works and is sufficiently succinct. Anyone else have any suggestions?

+6  A: 

You could join? I'd use two steps myself, though:

var keys = new HashSet<string>(list2.Select(x => x[0]));
var data = list.Where(x => keys.Contains(x[0]));

If you only have .NET 2.0, then either install LINQBridge and use the above (or similar with a Dictionary<> if LINQBridge doesn't include HashSet<>), or perhaps use nested Find:

var data = list.FindAll(arr => list2.Find(arr2 => arr2[0] == arr[0]) != null);

note though that the Find approach is O(n*m), where-as the HashSet<> approach is O(n+m)...

Marc Gravell
What is the reason for the HashSet? It seems to work well without the has set (see my edit above). Does the HashSet make it more efficent?
Mystere Man
note; for very small lists, it *can* be more efficient to just scan the list [like your edit does]... but for small lists it is going to be very fast no matter what approach you use. As the list size increases, the scan approach can quickly become a bottleneck.
Marc Gravell
Ok. One piece of infromation I should mention is that the "keys" (or list2) will always be relatively small, probably less than 10. While the source (list) can be several hundred elements (up to 1000).
Mystere Man
But in general: much more efficient, yes. Firstly, it only keeps the *distinct* keys; secondly, it uses a hash algorithm (similar to dictionary) so that Contains tends to O(1) rather than O(n) [essentially, think of it as an "index" in database terms].
Marc Gravell
If you have < 10 keys (list2), then either approach should be fine.
Marc Gravell
I think I prefer your original approach, even though it may be fine now, one never knows how things could grow in the future. Thanks
Mystere Man
In fact, for 10 keys, the optimal approach would probably be: var keys = list2.Select(x => x[0]).ToArray(); var data = list.Where(x => keys.Contains(x[0]));
Marc Gravell
No problem; hope it helped.
Marc Gravell
+3  A: 

You could use the Intersect extension method in System.Linq, but you would need to provide an IEqualityComparer to do the work.

 static void Main(string[] args)
 {
  List<string[]> data1 = new List<string[]>();
  List<string[]> data2 = new List<string[]>();

  var result = data1.Intersect(data2, new Comparer());
 }

 class Comparer : IEqualityComparer<string[]>
 {
  #region IEqualityComparer<string[]> Members

  bool IEqualityComparer<string[]>.Equals(string[] x, string[] y)
  {
   return x[0] == y[0];
  }

  int IEqualityComparer<string[]>.GetHashCode(string[] obj)
  {
   return obj.GetHashCode();
  }

  #endregion
 }
Ch00k
Interesting solution, but it's larger than the original problem ;)
Mystere Man
Depends on the context really. If this comes up repeatedly, put your equality comparer in a library assembly and you can use one simple call to Intersect to get the Intersection of your two lists. Later on if you need to, you can use the same comparer for Equals, or Except, or other uses
Ch00k
A: 

(q1) define lamda expression that represent a link list ? (2) lamda expression to find the number of elements in a given list ?

sudip kumar