tags:

views:

443

answers:

5

I have not used many lambda expressions before and I ran into a case where I thought I could make slick use of one. I have a custom list of ~19,000 records and I need to find out if a record exists or not in the list so instead of writing a bunch of loops or using linq to go through the list I decided to try this:

for (int i = MinX; i <= MaxX; ++i)
{
    tempY = MinY;

    while (tempY <= MaxY)
    {
        bool exists = myList.Exists(item => item.XCoord == i && item.YCoord == tempY);

        ++tempY;
    }
}

Only problem is it take ~9 - 11 seconds to execute. Am I doing something wrong is this just a case of where I shouldn't be using an expression like this?

Thanks.

EDIT: Sorry. I should elaborate. I am creating a list of records with the for and while loop and checking if that record exists in myList. That is the only way I could think of going about it. I will reevaluate it and see what I come with up.

+7  A: 

It's not a problem with the lambda, it's a problem with your algorithm. You're looping through from MinX to MaxX which is how many? then you're looping through the same from MinY to MaxY, then you're looping through ~19,000 records. so if the X loop is 10 and the y loop is 10, then you're making 19,000*10*10 calls. It could be a lot worse.

Joel
+4  A: 

Your algorithm is inefficient. If you are doing multiple searches on the same list you need to:

  1. Sort the list appropriately (by your look up key).
  2. Use a binary search to find the right record.

Your other option is if memory isn't an issue and you want it to be really fast is to put the records into a Dictionary<Your_Key,Record>. That will give you the fastest access after you have it setup.

Kevin
This worked really well. Only took 1 second to do what I wanted.
Nathan
+2  A: 

Isn't this what you want?

myList.Where(
    item=>item.XCoord>=MinX
        &&item.XCoord<=MaxX
        &&item.YCoord>=MinY
        &&item.YCoord<=MaxY
)

All the resulting items will satisfy the exists criterion.

...or did I misunderstand?

spender
+11  A: 

This code makes no sense to me so it is really hard to say whether you're doing it wrong or not. Odds are good that yes, you're doing it wrong.

Rather than showing the code, try this. Suppose you had a method that did exactly what you want. What would the signature of that method be? Not the body, just the signature.

Suppose for example you want to ask the question "does this list of points contain a particular point?" Then the signature would be

bool Contains(List<Point> list, Point p)

Suppose you want to ask the question "does this list of points contain any point inside this rectangle?" Then the signature would be

bool ContainsAny(List<Point> list, Rectangle r)

Suppose you want to ask the question "what are the points that these two lists have in common?" then the signature would be

List<Point> Intersection(List<Point> l1, List<Point> l2)

and so on. State what you are trying to do more clearly and we can help you optimize that operation. Start with the signature.

Eric Lippert
Does my edit above help? Sorry for the confusion.
Nathan
+1 because it really looks like Nathan is manually creating `Point`, when he could really just store a generic list of points and use `ContainsAny`.
Jim Schubert
+1  A: 

I'm going to expand on Kevin's answer with a nice linq-based solution.

The original code effectively computed a two dimensional boolean array indicating if a coordinate existed in myList at the x & y coordinates for each array element.

The test used for each x & y can be expressed as a lambda function as such:

Func<int, int, bool> original =
    (x, y) =>
        myList.Exists(item => item.XCoord == x && item.YCoord == y);

This is inefficient as the Exists method is called, and hence the list is iterated, for each x & y coordinate tested. Whether or not the entire list is iterated over depends if a match is found or not. In a lot of cases there won't be a match so the entire list is visited multiple times.

It is therefore best to pre-compute a dictionary of dictionaries to determine if a coordinate exists in myList for any x & y coordinate.

var xyLookup =
    (from item in myList
     group item by item.XCoord into XCoords
     select new
     {
         X = XCoords.Key,
         YLookup = (from x in XCoords
                    group x by x.YCoord into YCoords
                    select new
                    {
                        Y = YCoords.Key,
                        Coords = YCoords
                    }).ToDictionary(a => a.Y, a => a.Coords)
     }).ToDictionary(b => b.X, b => b.YLookup);

xyLookup now allows the following lambda function to replace the original version.

Func<int, int, bool> refactored =
    (x, y) =>
        xyLookup.ContainsKey(x) && xyLookup[x].ContainsKey(y);

Pre-computing xyLookup takes some time so, according to my tests, if I have a 3x3 array and myList contains 3 coordinates then both methods are roughly the same speed. I would expect, though, that the actual array size and number of elements in myList would be much larger in practice.

If I have a 100x100 array with 10,000 coordinates in myList then xyLookup is approximately 91 times faster than the original method.

I love linq... :-)

Enigmativity