This question is somewhat language-agnostic, but my tool of choice happens to be a numpy array.

What I am doing is taking the difference of two images via PIL:

img = ImageChops.difference(img1, img2)

And I want to find the rectangular regions that contain changes from one picture to the other. Of course there's the built-in .getbbox() method, but if there are two separate regions with changes it will return one box spanning both, and if there is a single-pixel change in each corner it will return the whole image.
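For reference, a conversion from the PIL difference image to a binary numpy array of changed pixels might look something like this (the file names and the noise threshold are just placeholder assumptions):

import numpy as np
from PIL import Image, ImageChops

# Placeholder file names -- any two frames of the same size will do
img1 = Image.open('frame1.png').convert('L')
img2 = Image.open('frame2.png').convert('L')

diff = ImageChops.difference(img1, img2)

# Binary mask of changed pixels: 1 where the frames differ by more than
# a small (assumed) noise threshold, 0 elsewhere
threshold = 10
data = (np.asarray(diff) > threshold).astype(np.uint8)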

For instance consider the following where o is a non-zero pixel:

______________________
|o            ooo    |
|      oooo   ooo    |
|      o             |
|      o  o          |
|                    |
|     oo       o     |
|    o  o     ooo    |
|     oo     ooooo   |
|             ooo    |
|              o     |
|____________________|

I'd like to get a set of 4-tuples containing the bounding box of each non-zero region. For the edge case of the

oooo
o
o  o

structure, I'm not terribly worried about how that's handled - getting both sections either separately or together is fine, because the bounds of the inverted-L shape will completely overlap the bounds of the single pixel.

I've never done anything this advanced with image processing so I wanted to get some input before I really write anything (and if there are pre-existing methods in the modules I'm already using, I welcome them!).

My pseudocode-ish version goes something like this:

for line in image:
   started = False
   for pixel in line:
      if pixel and not started:
         started = True
         save start coords
      elif started and not pixel:
         started = False
         save end coords (x - 1 of course)

This should give me a list of coordinates, but then I have to determine which regions are contiguous. I could do that with a graph-type search (we did plenty of DFS and BFS in Algorithms last semester), either instead of or in conjunction with my previous loops?
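A minimal, from-scratch sketch of that scan-plus-graph-search idea might look like the following (the function name and the 8-connectivity choice are my own assumptions; data is any 2-D array of zeros and non-zeros):

from collections import deque

def bounding_boxes(data):
    # Flood-fill (BFS) over 8-connected non-zero pixels, returning one
    # (left, upper, right, lower) box per connected region
    h, w = len(data), len(data[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if data[y][x] and not seen[y][x]:
                # New region: BFS out from this pixel, tracking its extent
                x0 = x1 = x
                y0 = y1 = y
                queue = deque([(y, x)])
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    x0, x1 = min(x0, cx), max(x1, cx)
                    y0, y1 = min(y0, cy), max(y1, cy)
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and data[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                boxes.append((x0, y0, x1 + 1, y1 + 1))
    return boxes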

I won't be doing this on "large" images - they're pulled from a webcam and the best one I currently have does 640x480. At most I'd be doing 720p or 1080p, but that's far enough into the future that it's not a real concern.

So my question(s): Am I headed down the right path, or am I way off? More importantly, are there any built-in functions that would save me from re-inventing the wheel? And finally, are there any good resources I should look at (tutorials, papers, etc.) that would help here?

Thanks!

+1  A: 

You could look for connected components in the image and then determine the bounding boxes of these components.

midtiby
I'm confused - how is this different than what I said I was doing?
Wayne Werner
+1  A: 

A clustering package (i.e. this one) should be able to do most of the work (finding the connected pixels). Finding the bounding box for a cluster is trivial after that.
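For illustration, one concrete form of that idea might look something like the following (this assumes scipy.cluster.hierarchy.fclusterdata as the clustering routine and a binary numpy array `data` of changed pixels; it may or may not match the package originally linked):

import numpy as np
from scipy.cluster.hierarchy import fclusterdata

# (row, col) coordinates of every non-zero pixel
coords = np.argwhere(data)

# Single-linkage clustering with a 1.5-pixel distance cutoff groups pixels
# that touch (including diagonally) into the same cluster
labels = fclusterdata(coords, t=1.5, criterion='distance', method='single')

# One bounding box per cluster, as (left, upper, right, lower)
for cluster_id in np.unique(labels):
    pts = coords[labels == cluster_id]
    (r0, c0), (r1, c1) = pts.min(axis=0), pts.max(axis=0)
    print((c0, r0, c1 + 1, r1 + 1))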

THC4k
How would clustering help (this package especially)? From what I gather reading the docstrings, it can only find data that are X distance away, and it doesn't seem to store any X-Y data. Am I missing something?
Wayne Werner
+3  A: 

I believe scipy's ndimage module has everything you need...

Here's a quick example

import numpy as np
import scipy as sp
import scipy.ndimage.morphology

# The array you gave above
data = np.array( 
        [
           [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0], 
           [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0], 
           [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
           [0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
           [0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 
           [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0], 
           [0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0], 
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0], 
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], 
        ])


# Fill holes to make sure we get nice clusters
filled = sp.ndimage.morphology.binary_fill_holes(data)

# Now separate each group of contiguous ones into a distinct value.
# This will be an array of values from 1 to num_objects, with zeros
# outside of any contiguous object
objects, num_objects = sp.ndimage.label(filled)

# Now return a list of slices around each object
#  (This is effectively the tuple that you wanted)
object_slices =  sp.ndimage.find_objects(objects)

# Just to illustrate using the object_slices
for obj_slice in object_slices:
    print(data[obj_slice])

This outputs:

[[1]]
[[1 1 1]
 [1 1 1]]
[[1 1 1 1]
 [1 0 0 0]
 [1 0 0 1]]
[[1]]
[[0 1 1 0]
 [1 0 0 1]
 [0 1 1 0]]
[[0 0 1 0 0]
 [0 1 1 1 0]
 [1 1 1 1 1]
 [0 1 1 1 0]
 [0 0 1 0 0]]

Note that the "object_slices" are basically what you originally asked for, if you need the actual indices.
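If you want those as PIL-style (left, upper, right, lower) 4-tuples rather than slice objects, a small follow-up (my naming) would be:

for row_slice, col_slice in object_slices:
    # slice.start/stop hold the row and column bounds that find_objects computed
    bbox = (col_slice.start, row_slice.start, col_slice.stop, row_slice.stop)
    print(bbox)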

Edit: Just wanted to point out that despite it appearing to properly handle the edge case of

[[1 1 1 1]
 [1 0 0 0]
 [1 0 0 1]]

it actually doesn't (thus the extra lone [[1]]): the lone pixel isn't actually connected to the inverted-L, and binary_fill_holes only fills fully enclosed holes, so it can't bridge that gap. You can see this if you print out the "objects" array and take a look at objects 3 & 4.

[[1 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 0 0 0 0]
 [0 0 0 0 0 0 3 3 3 3 0 0 0 2 2 2 0 0 0 0]
 [0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 3 0 0 4 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 5 5 0 0 0 0 0 0 0 6 0 0 0 0 0]
 [0 0 0 0 5 5 5 5 0 0 0 0 0 6 6 6 0 0 0 0]
 [0 0 0 0 0 5 5 0 0 0 0 0 6 6 6 6 6 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 6 6 6 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 0]]

Hope that helps!


Joe Kington
Holy smokes that's perfect! That's exactly what I wanted - and I think I actually prefer it to handle the edge case that way - that way it really will get the boxes for *all* the pixels. I wish I could up vote more than once!
Wayne Werner
@Wayne - Glad to help! There's quite a nice batch of functions in scipy.ndimage, once you learn how to string the various operators together. Good luck!
Joe Kington