views:

513

answers:

5

Note: I'm optimizing because of past experience and due to profiler software's advice. I realize an alternative optimization would be to call GetNeighbors less often, but that is a secondary issue at the moment.

I have a very simple function described below. In general, I call it within a foreach loop. I call that function a lot (about 100,000 times per second). A while back, I coded a variation of this program in Java and was so disgusted by the speed that I ended up replacing several of the for loops which used it with 4 if statements. Loop unrolling seems ugly, but it did make a noticeable difference in application speed. So, I've come up with a few potential optimizations and thought I would ask for opinions on their merit and for suggestions:

  • Use four if statements and totally ignore the DRY principle. I am confident this will improve performance based on past experience, but it makes me sad. To clarify, the 4 if statements would be pasted anywhere I called getNeighbors() too frequently and would then have the inside of the foreach block pasted within them.
  • Memoize the results in some mysterious manner.
  • Add a "neighbors" property to all squares. Generate its contents at initialization.
  • Use a code generation utility to turn calls to GetNeighbors into if statements as part of compilation.

    public static IEnumerable<Square> GetNeighbors(Model m, Square s)
    {
        int x = s.X;
        int y = s.Y;        
        if (x > 0) yield return m[x - 1, y];
        if (y > 0) yield return m[x, y - 1];
        if (x < m.Width - 1) yield return m[x + 1, y];
        if (y < m.Height - 1) yield return m[x, y + 1];
        yield break;
    }
    
    
    //The property of Model used to get elements.
    private Square[,] grid;
    //...
    public Square this[int x, int y]
    {
        get
        {
            return grid[x, y];
        }
    }
    

Note: 20% of the time spent by the GetNeighbors function is spent on the call to m.get_Item, the other 80% is spent in the method itself.

+1  A: 

I'd suggest making an array of Squares (capacity four) and returning that instead. I would be very suspicious about using iterators in a performance-sensitive context. For example:

// could return IEnumerable<Square> still instead if you preferred.
public static Square[] GetNeighbors(Model m, Square s)
{
    int x = s.X, y = s.Y, i = 0;
    var result = new Square[4];

    if (x > 0) result[i++] = m[x - 1, y];
    if (y > 0) result[i++] = m[x, y - 1];
    if (x < m.Width  - 1) result[i++] = m[x + 1, y];
    if (y < m.Height - 1) result[i++] = m[x, y + 1];

    return result;
}

I wouldn't be surprised if that's much faster.

mquander
I was in the process of suggesting the exact same thing. The use of IEnumerable introduces some hidden costs under the cover, and for only returning 4 elements ever, you don't need the flexibility and ease of use IEnumerable provides.
Michael
I'm discovering that even when I have many elements, ienumerable is just way too expensive. This wasn't the only function paying this price.
Brian
You might be interested in reading this blog entry describing the specific code generation that takes place for iterators and yield return. http://blogs.msdn.com/wesdyer/archive/2007/03/23/all-about-iterators.aspx
mquander
+1  A: 

I'm on a slippery slope, so insert disclaimer here.

I'd go with option 3. Fill in the neighbor references lazily and you've got a kind of memoization.

ANother kind of memoization would be to return an array instead of a lazy IEnumerable, and GetNeighbors becomes a pure function that is trivial to memoize. This amounts roughly to option 3 though.

In any case, but you know this, profile and re-evaluate every step of the way. I am for example unsure about the tradeoff between the lazy IEnumerable or returning an array of results directly. (you avoid some indirections but need an allocation).

Kurt Schelfthout
+4  A: 

Brian,

I've run into similar things in my code.

The two things I've found with C# that helped me the most:

First, don't be afraid necessarily of allocations. C# memory allocations are very, very fast, so allocating an array on the fly can often be faster than making an enumerator. However, whether this will help depends a lot on how you're using the results. The only pitfall I see is that, if you return a fixed size array (4), you're going to have to check for edge cases in the routine that's using your results.

Depending on how large your matrix of Squares is in your model, you may be better off doing 1 check up front to see if you're on the edge, and if not, precomputing the full array and returning it. If you're on an edge, you can handle those special cases separately (make a 1 or 2 element array as appropriate). This would put one larger statement in there, but that is often faster in my experience. If the model is large, I would avoid precomputing all of the neighbors. The overhead in the Squares may outweigh the benefits.

In my experience, as well, preallocating and returning vs. using yield makes the JIT more likely to inline your function, which can make a big difference in speed. If you can take advantage of the IEnumerable results and you are not always using every returned element, that is better, but otherwise, precomputing may be faster.

The other thing to consider - I don't know what information is saved in Square in your case, but if hte object is relatively small, and being used in a large matrix and iterated over many, many times, consider making it a struct. I had a routine similar to this (called hundreds of thousands or millions of times in a loop), and changing the class to a struct, in my case, sped up the routine by over 40%. This is assuming you're using .net 3.5sp1, though, as the JIT does many more optimizations on structs in the latest release.

There are other potential pitfalls to switching to struct vs. class, of course, but it can have huge performance impacts.

Reed Copsey
+1  A: 

Why not make the Square class responsible of returning it's neighbours? Then you have an excellent place to do lazy initialisation without the extra overhead of memoization.

public class Square {

 private Model _model;
 private int _x;
 private int _y;
 private Square[] _neightbours;

 public Square(Model model, int x, int y) {
  _model = model;
  _x = x;
  _y = y;
  _neightbours = null;
 }

 public Square[] Neighbours {
  get {
   if (_neightbours == null) {
    _neighbours = GetNeighbours();
   }
   return _neighbours;
  }
 }

 private Square[] GetNeightbours() {
  int len = 4;
  if (_x == 0) len--;
  if (_x == _model.Width - 1) len--;
  if (_y == 0) len--;
  if (-y == _model.Height -1) len--;
  Square [] result = new Square(len);
  int i = 0;
  if (_x > 0) {
   result[i++] = _model[_x - 1,_y];
  }
  if (_x < _model.Width - 1) {
   result[i++] = _model[_x + 1,_y];
  }
  if (_y > 0) {
   result[i++] = _model[_x,_y - 1];
  }
  if (_y < _model.Height - 1) {
   result[i++] = _model[_x,_y + 1];
  }
  return result;
 }

}
Guffa
If there aren't a lot of squares in the Model, this is great, but you're adding a lot of pointers if you have many squares. The "100,000/second" loops suggests a lot of squares, though, so I'd be watchful of memory, too.
Reed Copsey
A: 

Depending on the use of GetNeighbors, maybe some inversion of control could help:

public static void DoOnNeighbors(Model m, Square s, Action<s> action) {
  int x = s.X;
  int y = s.Y;        
  if (x > 0) action(m[x - 1, y]);
  if (y > 0) action(m[x, y - 1]);
  if (x < m.Width - 1) action(m[x + 1, y]);
  if (y < m.Height - 1) action(m[x, y + 1]);
}

But I'm not sure, if this has better performance.

MartinStettner
DoOnNeighbors sounds kinda pervy
StingyJack
Might be 4 times better since it uses one delegate instead of 4 enumerates.
Brian