views:

33

answers:

3

Hi,

lets say I have a very big matrix with 10000x10000 elements all having the value '0'. Lets say there are some big 'nests' of '1's. Those areas might even be connected, but very weekly connected by a 'pipe' of '1's.

I want to get an algorithm that very quickly (and dirty if necessary) finds these 'nests' of '1's. Here it shouldn't 'cut apart' two weekly connected 'nests'.

Any idea how I should do such an algorithm?

+1  A: 

Maybe a pathfinding algorithm like A* (or something simpler like a BFS or DFS) may work in this case..

You can:

  • search starting point for your searches by finding small nests (ignoring pipes).. so at least a 3x3 block of 1's
  • then you should pathfind from there going through 1's until you end your "connected component" (poetic license) inside the matrix
  • repeat starting from another small 1's block
Jack
A: 

I would say it depends on how the data is needed. If, given two points, you need to check if they are in the same block of 1's, I think @Jack's answer is best. This is also true if you have some knowledge of where blocks are initially, as you can use those as starting points for your algorithm.

If you don't have any other information, maybe one of these would be a possibility:

If given a point, you wish to find all elements in the same block, a flood fill would be appropriate. Then you could cache each nest as you find it, and when you get another point first see if it's in a known nest, and if it isn't do a flood fill to find this nest then add it to the cache.

As an implementation detail, as you traverse the matrix each row should have available the set of nests present on the previous row. Then you would only need to check new points against those nests, rather than the complete set, to determine if a new point is in a known set or not.

Be sure that you use a set implementation with a very low lookup cost such as a hashtable or possibly a Bloom filter if you can deal with the probabilistic effects.

John
A: 
  1. Turn the matrix into a black&white bitmap
  2. Scale the matrix so that nests of size N become a single pixel (so if you look for 10x10 nests, scale by a factor of N=10).
  3. Use the remaining pixels of the output to locate the nests. Use the center coordinate (multiplied by the factor above) to locate the same nest in the matrix.
  4. Use a low-pass filter to get rid of all "pipes" that connect the nests.
  5. Find the border of the nest with a contrast filter on the bitmap.
  6. Create a bitmap which doesn't contain the nests (i.e. set all pixels of the nests to 0).
  7. Use a filter that widens single pixels to grow the outline of the nests.
  8. Bitwise AND the output of 7 and 5 to get the connection points of all pipes.
  9. Follow the pipes to see how they connect the nests
Aaron Digulla