I am working with a grid of squares which have two states, "ON" and "OFF." I have a rather simple Connected Component Labeling algorithm which finds all of the "ON" components. Usually, but not always, there is exactly one "ON" component.
I wish to construct an algorithm which takes in as input a matrix of on/off cells, a component labeling (probably formatted as a list of hashsets of cells), and a list of cells that have changed since the labeling was formed, and output a new labeling. The obvious solution is to just recalculate from scratch, though this is not efficient. In general the list of cells that have changed will be small.
In the case of the change list being only cells that have turned ON, this is easy to do:
Groups G;
Foreach changed cell C:
Group U = emptygroup;
U.add(C);
Foreach Group S in G:
if (S contains a cell which is adjacent to C)
G.Remove(S);
U.UnionWith(S);
G.add(C);
However, if the changes contain any cells which have turned off, I am not sure what to do. Keep in mind that all ON cells must be a member of exactly one group. So, one solution would be to take each cell that is adjacent to the newly OFF cell and see if they are connected to one another (e.g. using * pathfinding). This will yield 1-4 contiguous groups (unless the cell was the only cell in its group and thus has 0 adjacent cells to check, in which case it yields 0 groups). However, this is only a little better than starting from scratch since usually (but not always) connecting these adjacent squares together is about as hard as just finding a contiguous group (unless someone has a suggestion for a smart way to do that). Also, it is a bit scary to do if there are a lot of changed cells...though I admit there usually aren't.
Context, for those who insist on knowing why I am doing this: One rule in Nurikabe puzzles is that you may only have 1 contiguous group of walls. A simplficiation of a problem I am trying to solve to gain increased speed (and to play with pathfinding) is above. Basically, I wish to check for contiguous walls without wasting the information gained from previous such tests. I am trying to see how many places in my solver I can make use of previous information in order to enhance speed, since it seems kind of painful to use an O(f(N)) algorithm when an O(f(Δ)) algorithm will suffice (N being the size of the puzzle and Δ being the changes made since the algorithm was run last).
Profiling does indicate that improving this algorithm will make a difference to execution time, but this is a project for fun rather than for profit so it doesn't really matter that much, except to being able to measure whether the change had any influence.
Note: I have omitted explaining my current algorithm, but it basically works by doing a stack-based Flood Fill algorithm on the first ON square it finds, then checks to see if there are any more ON squares (which means there is more than one group, which it does not bother to examine).
Edit: Enhancement idea: Yairchu and John Kugelman's' suggestions crystalized in my head into this improvement, which is not actually a solution to this problem per se, but may make this part of the code and several other pieces of code run faster:
Current loop:
foreach (Square s in m.Neighbors[tmp.X][tmp.Y])
{
if (0 != ((byte)(s.RoomType) & match) && Retval.Add(s)) curStack.Push(s);
}
Improvement idea:
foreach (Square s in m.NeighborsXX[tmp.X][tmp.Y])
{
if (Retval.Add(s)) curStack.Push(s);
}
This would require maintaining several m.NeighborsXX instances (one for each type of match that needs to be enhanced) and updated them all whenever a square changed. I'd need to benchmark this to see if it actually helped, but it looks like a standard case of trading some memory for some speed.