views:

682

answers:

4

Hi, I dunno if this should go in a Math forum or a Programming forums, but I'll post it in both and see where I get.

I have two computer images... one of them is an "original" image (a large TIF file). The other one is a transformed version of the original image... it's been rotated, sheared and translated in a software program. I need to do some work on the transformed image, but I need the (x-y) coordinates of each pixel in the original image to finish my calculations.

I know that the image was rotated and sheared with a 3x3 Transformation matrix. If I had the matrix, I could derive the second image from the first (or vice-versa) myself. I don't know exactly how much it was rotated, sheared, or translated, so I can't just derive the matrices from a set of known transformations. What I do have is a set of corresponding points (the corners, et al) in each image, and their corresponding (x,y) coordinates. So here's my dilemma:

Using a set of corresponding transformed points ((x,y) -> (x',y'), three or more of them), can I derive the Transformation matrix that was used to turn one image into the other? If I can derive the matrix, I can solve for the original coordinates of all the pixels (all 18-million of 'em) and get the calculations done that I need to do.

Can anyone help? I'm familiar with linear algebra... just not familiar enough to derive this without a whole lotta head scratching. Anything is appreciated!

  • Mike
+1  A: 

I think you should start by providing a list of, say 6 3 points (for 6 unknowns) with X/Y coordinates before and after transformation.

Then somebody more clever than I should pop that into a set of linear equations and then feed it to (say) Wolfram Alpha for solving.

The top of Java's documentation for AffineTransform shows how the matrix needs to be set up:

[ x']   [  m00  m01  m02  ] [ x ]   [ m00x + m01y + m02 ]
[ y'] = [  m10  m11  m12  ] [ y ] = [ m10x + m11y + m12 ]
[ 1 ]   [   0    0    1   ] [ 1 ]   [         1         ]

Removing most of the fluff leaves:

[ x']   [ m00x + m01y + m02 ]
[ y'] = [ m10x + m11y + m12 ]

Then you just set up a set of 6 x 2 equations like this:

m00x + m01y + m02 - x' = 0
m10x + m11y + m12 - y' = 0

(repeat for 2 other x/y before/after pairs)

and throw them at an equation solver.

Carl Smotricz
This is a simplification of my answer where you simply drop the c3 and c7 terms. But, good work spelling the actual linear algebra involved. Your "equation solver" is my "Gaussian Elimination".
Frank Krueger
Thank you! I'm happy to see I came close. I think I did my last GE 30 years ago; since then I've been happy to let a program do it. Glad to see Mike was taken care of by a pro.
Carl Smotricz
+2  A: 

Not sure if you want manual or automatic...

Manual

If you specify the transformed coordinates of the four corners of your rectangle, then you can derive the transformation equations:

alt text

(From Pierre Wellner's Interacting with Paper on the DigitalDesk and more details in his Thesis)

Now you just have to solve for the coefficients of the equation.

With four point pairs, the two sets of four simultaneous linear equations can be quickly solved by Gaussian Elimination to find the values of c1-8.

Lastly, you can turn those equations into the 3x3 matrices you want. The above equations are powerful enough to do non-linear transformations and you can simplify it into the 3x3 affine shear matrix.

But I would just stick with the nonliner equations (above) since they can handle perspective distortion.

Automatic

Same method, but you can use an edge-detector comboined with a line detection algorithm to find a set of 4-ish lines that makeup a rectangle.

If your image rectangles really stand out (whiteish images on a dark background), then you can use corner detection available from libraries like OpenCV's Feature Detection (see cvCornerHarris).

You can intersect those lines to find the four corners and use the transformation equation.

Frank Krueger
I should let Mike speak for himself, but I for one feel hopelessly inadequate to understand your explanation. I think many of us are only passingly familiar with mathematics.
Carl Smotricz
Oh. Well Gaussian Elimination is well documented in books such as "Numerical Algorithms". But I would like to stress that this is far superior to 3x3 affine transformations as it can handle perspective distortion (a non-linear non-affine transformation). If there is interest, I can expand on my answer.
Frank Krueger
Comment Span. I would like to add that I have used this exact method in a production project to excellent effect. My apologies if I didn't do it justice, but don't let my glib description steer you away from this powerful technique. Please read the linked-to materials.
Frank Krueger
This is PERFECT, Frank. I don't think the explanation was glib, and it gives me a map for what I need to do. I'm certain there are perspective transformations going on in there. I can derive the c1-c8 constants from 4 non-linear points... they needn't be the corners, right? I just remembered that two of the corners are clipped in the second image, but it shouldn't matter in this case... there are a number of very distinct reference points I can use. I'll plug those in to convert the rest. I'm not married to the idea of using 3x3 matrices... I just needed a method to translate points.
Mike
I think your image link is broken and I have a feeling the lack of that image is why I seem to be missing the part about how to derive a set of non-linear equations from the two separate set of corresponding points. If you could expound on how to derive the transformation equations I think that would help my out a lot.
jpierson
+1  A: 

You only need 3 points to define a 3x3 transformation matrix. If you have the points (0,0), (0,1) and (1,0) and transform them by the matrix [a b c d e f 0 0 1], you'll get (c,f), (b,e) and (a,d).

Javier
Oh... we have 6 unknowns, but there are 2 equations for each pair of dots... so 3 pairs will give us 6 equations. Clever!
Carl Smotricz
A: 

Thank you all. I do think I'm gonna use Frank's method, simply because I believe--now that he brings it up--there are actually non-affine transformations going on in there. I was looking at the two images and realizing (the way it's transformed) that 3 points alone wouldn't be enough. Let's say (for just an example) that 3 points are simply rotated around the hypothetical z-axis, but the 4th corner is completely stretched out by some distance as well. Giving the first 3 points (which would give a rotation matrix, in this case) wouldn't accurately solve for all the pixels, because I believe it's a non-affine transformation going on with the 4th point. Giving 4 points in the equation Frank gave me will correctly solve it.

I apologize, then, that my original question was misstated (the 3x3 matrix problem). Y'all gave me perfectly good equations for what I actually asked, but Frank's explanation (now that I realize it) solves what I needed. Much appreciated! I will try to hang around these forums more often... mebbe I can answer someone else's question sometime (a bit of forum Karma, if you will). Much appreciated.

  • Mike
Mike