views:

381

answers:

2

I have a very simple function which takes in a matching bitfield, a grid, and a square. It used to use a delegate but I did a lot of recoding and ended up with a bitfield & operation to avoid the delegate while still being able to perform matching within reason. Basically, the challenge is to find all contiguous elements within a grid which match the match bitfield, starting from a specific "leader" square. Square is somewhat small (but not tiny) class. Any tips on how to push this to be even faster? Note that the grid itself is pretty small (500 elements in this test).

Edit: It's worth noting that this function is called over 200,000 times per second. In truth, in the long run my goal will be to call it less often, but that's really tough, considering that my end goal is to make the grouping system be handled with scripts rather than being hardcoded. That said, this function is always going to be called more than any other function.

Edit: To clarify, the function does not check if leader matches the bitfield, by design. The intention is that the leader is not required to match the bitfield (though in some cases it will).

Things tried unsuccessfully:

  • Initializing the dictionary and stack with a capacity.
  • Casting the int to an enum to avoid a cast.
  • Moving the dictionary and stack outside the function and clearing them each time they are needed. This makes things slower!

Things tried successfully:

  • Writing a hashcode function instead of using the default: Hashcodes are precomputed and are equal to x + y * parent.Width. Thanks for the reminder, Jim Mischel.
  • mquander's Technique: See GetGroupMquander below.
  • Further Optimization: Once I switched to HashSets, I got rid of the Contains test and replaced it with an Add test. Both Contains and Add are forced to seek a key, so just checking if an add succeeds is more efficient than adding if a Contains fails check fails. That is, if (RetVal.Add(s)) curStack.Push(s);

    public static List<Square> GetGroup(int match, Model grid, Square leader)
    {
        Stack<Square> curStack = new Stack<Square>();
        Dictionary<Square, bool> Retval = new Dictionary<Square, bool>();
        curStack.Push(leader);
        while (curStack.Count != 0)
        {
            Square curItem = curStack.Pop();
            if (Retval.ContainsKey(curItem)) continue;
            Retval.Add(curItem, true);
            foreach (Square s in curItem.Neighbors)
            {
                if (0 != ((int)(s.RoomType) & match))
                {
                    curStack.Push(s);
                }
            }
        }
        return new List<Square>(Retval.Keys);
    }
    

=====

    public static List<Square> GetGroupMquander(int match, Model grid, Square leader)
    {
        Stack<Square> curStack = new Stack<Square>();
        Dictionary<Square, bool> Retval = new Dictionary<Square, bool>();
        Retval.Add(leader, true);
        curStack.Push(leader);
        while (curStack.Count != 0)
        {
            Square curItem = curStack.Pop();

            foreach (Square s in curItem.Neighbors)
            {
                if (0 != ((int)(s.RoomType) & match))
                {
                    if (!Retval.ContainsKey(s))
                    {
                        curStack.Push(s);
                        Retval.Add(curItem, true);
                    }
                }
            }
        }
        return new List<Square>(Retval.Keys);
    }
+4  A: 

The code you posted assumes that the leader square matches the bitfield. Is that by design?

I assume your Square class has implemented a GetHashCode method that's quick and provides a good distribution.

You did say micro-optimization . . .

If you have a good idea how many items you're expecting, you'll save a little bit of time by pre-allocating the dictionary. That is, if you know you won't have more than 100 items that match, you can write:

Dictionary<Square, bool> Retval = new Dictionary<Square, bool>(100);

That will avoid having to grow the dictionary and re-hash everything. You can also do the same thing with your stack: pre-allocate it to some reasonable maximum size to avoid resizing later.

Since you say that the grid is pretty small it seems reasonable to just allocate the stack and the dictionary to the grid size, if that's easy to determine. You're only talking grid_size references each, so memory isn't a concern unless your grid becomes very large.

Adding a check to see if an item is in the dictionary before you do the push might speed it up a little. It depends on the relative speed of a dictionary lookup as opposed to the overhead of having a duplicate item in the stack. Might be worth it to give this a try, although I'd be surprised if it made a big difference.

if (0 != ((int)(s.RoomType) & match))
{
    if (!Retval.ContainsKey(curItem))
        curStack.Push(s);
}

I'm really stretching on this last one. You have that cast in your inner loop. I know that the C# compiler sometimes generates a surprising amount of code for a seemingly simple cast, and I don't know if that gets optimized away by the JIT compiler. You could remove that cast from your inner loop by creating a local variable of the enum type and assigning it the value of match:

RoomEnumType matchType = (RoomEnumType)match;

Then your inner loop comparison becomes:

if (0 != (s.RoomType & matchType))

No cast, which might shave some cycles.

Edit: Micro-optimization aside, you'll probably get better performance by modifying your algorithm slightly to avoid processing any item more than once. As it stands, items that do match can end up in the stack multiple times, and items that don't match can be processed multiple times. Since you're already using a dictionary to keep track of items that do match, you can keep track of the non-matching items by giving them a value of false. Then at the end you simply create a List of those items that have a true value.

    public static List<Square> GetGroup(int match, Model grid, Square leader)
    {
        Stack<Square> curStack = new Stack<Square>(); 
        Dictionary<Square, bool> Retval = new Dictionary<Square, bool>(); 
        curStack.Push(leader);
        Retval.Add(leader, true);
        int numMatch = 1;
        while (curStack.Count != 0)
        {
            Square curItem = curStack.Pop(); 
            foreach (Square s in curItem.Neighbors) 
            {
                if (Retval.ContainsKey(curItem))
                    continue;
                if (0 != ((int)(s.RoomType) & match))
                {
                    curStack.Push(s);
                    Retval.Add(s, true);
                    ++numMatch;
                }
                else
                {
                    Retval.Add(s, false);
                }
            }
        }
        // LINQ makes this easier, but since you're using .NET 2.0...
        List<Square> matches = new List<Square>(numMatch);
        foreach (KeyValuePair<Square, bool> kvp in Retval)
        {
            if (kvp.Value == true)
            {
                matches.Add(kvp.Key);
            }
        }
        return matches;
    }
Jim Mischel
+1 on the presized collections advice, that must be a big chunk of the performance cost.
mquander
Presized collections ended up making it slower or the same. I think most of the calls to "GetGroup" end up with a very tiny number of members. Writing a hashcode function seems to have helped, perhaps because my hashcode function is 3x faster than the native one and will never have collisions.
Brian
More precisely, my hash function has the minimum number of collisions possible, since it just uses a cached copy of the calculation, x + y * parent.Width;
Brian
+1 for your revised algorithm - this seems much better than the original.
Reed Copsey
I tried the revised algorithm, but it made no difference. I also tried using a two dictionary variation on your algorithm, which also made no difference.
Brian
That makes sense. The match check is going to be cheaper than the dictionary lookup.
Jim Mischel
mquander's method should perform better than my revised algorithm.
Jim Mischel
A: 

Here are a couple of suggestions -

If you're using .NET 3.5, you could change RetVal to a HashSet<Square> instead of a Dictionary<Square,bool>, since you're never using the values (only the keys) in the Dictionary. This would be a small improvement.

Also, if you changed the return to IEnumerable, you could just return the HashSet's enumerator directly. Depending on the usage of the results, it could potentially be faster in certain areas (and you can always use ToList() on the results if you really need a list).

However, there is a BIG optimization that could be added here -

Right now, you're always adding in every neighbor, even if that neighbor has already been processed. For example, when leader is processed, it adds in leader+1y, then when leader+1y is processed, it puts BACK in leader (even though you've already handled that Square), and next time leader is popped off the stack, you continue. This is a lot of extra processing.

Try adding:

foreach (Square s in curItem.Neighbors)
{
    if ((0 != ((int)(s.RoomType) & match)) && (!Retval.ContainsKey(s)))
    {
        curStack.Push(s);
    }
}

This way, if you've already processed the square of your neighbor, it doesn't get re-added to the stack, just to be skipped when it's popped later.

Reed Copsey
I tried that change. It doesn't help.
Brian
I think the reason checking if retval.containskey later doesn't help is that I end up checking again when I pop off the stack. I can't avoid that check because I may end up adding the same square to the stack more than once as adding a square to the stack does not add it to Retval.
Brian
Why not add it to retval right there, inside the conditional quoted in this answer?
mquander
@mquander: If he does that, given his current logic, that square's neighbors would never be found. He can't add it to RetVal until he processes the neighbors.
Reed Copsey
@Brian: At least add a check to avoid adding curItem back into the Stack - right now, you're adding way too much into your stack. It may not help enough, but that change should help.
Reed Copsey
@Reed: Given his current check up top, yes. I'm suggesting that for each neighbor who matches, he both adds it to the stack and to retval if it's not currently in retval. Then, when he pops a neighbor off the stack, that neighbor's been vetted already and needs no further checks.
mquander
@mquander: Your suggestion made it a bit faster. Thanks.
Brian