views:

2096

answers:

7

Given two image buffers (assume it's an array of ints of size width * height, with each element a color value), how can I map an area defined by a quadrilateral from one image buffer into the other (always square) image buffer? I'm led to understand this is called "projective transformation".

I'm also looking for a general (not language- or library-specific) way of doing this, such that it could be reasonably applied in any language without relying on "magic function X that does all the work for me".

An example: I've written a short program in Java using the Processing library (processing.org) that captures video from a camera. During an initial "calibrating" step, the captured video is output directly into a window. The user then clicks on four points to define an area of the video that will be transformed, then mapped into the square window during subsequent operation of the program. If the user were to click on the four points defining the corners of a door visible at an angle in the camera's output, then this transformation would cause the subsequent video to map the transformed image of the door to the entire area of the window, albeit somewhat distorted.

+1  A: 

There is a C++ project on CodeProject that includes source for projective transformations of bitmaps. The maths are on Wikipedia here. Note that so far as i know, a projective transformation will not map any arbitrary quadrilateral onto another, but will do so for triangles, you may also want to look up skewing transforms.

Shane MacLaughlin
+1  A: 

I think what you're after is a planar homography, have a look at these lecture notes:

http://www.cs.utoronto.ca/~strider/vis-notes/tutHomography04.pdf

If you scroll down to the end you'll see an example of just what you're describing. I expect there's a function in the Intel OpenCV library which will do just this.

Ian Hopkinson
+1  A: 

It's not clear at the time I write this precisely what you want

Assuming you just want to copy an image into another another image, but scale it to different widths and heights, there are a few techniques:

In all of these let the first region have heights w and h, and the second w' and h'.

Nearest neightbor: For all destination pixels (x,y), copy the source pixel at (round(x*w/w'), round(y*h/h')).

This will distort the image noticeably unless it's a nice w/w' is a nice ratio like 1/2,

Bilinear interpolation: For all destination pixels (x,y), blend (take a weighted average of) the source pixels at the floor and ceiling of (x*w/w'),(y*h/h'), weighted by the fractional part and (1-the fractional part) -- how close it is to one pixel vs another.

http://en.wikipedia.org/wiki/Bilinear_interpolation

This will distort the image less noticeably, but possibly blur it a bit.

There are other more complicated techniques in use:

But if you can go to the trouble of them, and can spare the efficiency hit, I'd use a Fourier transform instead.

Fourier transform: Perform a 2-d FFT of the source into a temporary buffer. Pad and/or truncate it to the new size (w',h'). Unfourier transform into the new destination. This is pretty much the best looking, but is a bit more expensive.

wnoise
+1  A: 

Here's how would do it in principle:

  • map the origin of A to the origin of B via a traslation vector t.
  • take unit vectors of A (1,0) and (0,1) and calculate how they would be mapped onto the unit vectors of B.
  • this gives you a transformation matrix M so that every vector a in A maps to M a + t
  • invert the matrix and negate the traslation vector so for every vector b in B you have the inverse mapping b -> M-1 (b - t)
  • once you have this transformation, for each point in the target area in B, find the corresponding in A and copy.

The advantage of this mapping is that you only calculate the points you need, i.e. you loop on the target points, not the source points. It was a widely used technique in the "demo coding" scene a few years back.

Sklivvz
You're describing a linear transformation, which allows translation, rotation, scaling, reflection and shear. Unfortunately a projective transformation is not in general linear (in the sense it's not just a matrix multiply).
Hugh Allen
Apparently I spoke too soon; "linear transformations" according to wikipedia (see linear map) don't even include translation. But my point stands that a projective transformation is more complicated than what you describe.
Hugh Allen
Uhm, a linear transformation is without the traslation element. Furthermore linear transformations certainly include projections (e.g. ((1,0),(0,0)) is a projection)
Sklivvz
In any case I never said that the single elements should be constants. They could be functions of the point, in which case calculating the inverses would be more complicated, but the principle would still apply.
Sklivvz
Update: the term I really meant was "Affine transformation".
Hugh Allen
BTW a projective transformation is not in general a projection.
Hugh Allen
Hugh, yep, an affine transformation would certainly be more general, but also much more complicated to explain. I tried to give a simple answer.Regarding projective transformations, I think they are overkill for what the author needs, but thanks for pointing it out
Sklivvz
Hugh, also projective transformations are apparently linear transformations:"Actually, it is bilinear because the composition of projections is a binary linear operator, similar to matrix multiplication.", from http://en.wikipedia.org/wiki/Projective_transformation
Sklivvz
In my answer, I show how to do it using linear algebra. Just need to add in the translation (1) and the projective (xy) factors. The more points that you have, the higher degrees of x and y you need.
Eyal
+1  A: 
b3
+2  A: 

If this transformation has to look good (as opposed to the way a bitmap looks if you resize it in Paint), you can't just create a formula that maps destination pixels to source pixels. Values in the destination buffer have to be based on a complex averaging of nearby source pixels or else the results will be highly pixelated.

So unless you want to get into some complex coding, use someone else's magic function, as smacl and Ian have suggested.

MusiGenesis
A: 

Using linear algebra is much easier than all that geometry! Plus you won't need to use sine, cosine, etc, so you can store each number as a rational fraction and get the exact numerical result if you need it.

What you want is a mapping from your old (x,y) co-ordinates to your new (x',y') co-ordinates. You can do it with matrices. You need to find the 2-by-4 projection matrix P such that P times the old coordinates equals the new co-ordinates. We'll assume that you're mapping lines to lines (not, for instance, straight lines to parabolas). Because you have a projection (parallel lines don't stay parallel) and translation (sliding), you need a factor of (xy) and (1), too. Drawn as matrices:

          [x  ]
[a b c d]*[y  ] = [x']
[e f g h] [x*y]   [y']
          [1  ]

You need to know a through h so solve these equations:

a*x_0 + b*y_0 + c*x_0*y_0 + d = i_0
a*x_1 + b*y_1 + c*x_1*y_1 + d = i_1
a*x_2 + b*y_2 + c*x_2*y_2 + d = i_2
a*x_3 + b*y_3 + c*x_3*y_3 + d = i_3

e*x_0 + f*y_0 + g*x_0*y_0 + h = j_0
e*x_1 + f*y_1 + g*x_1*y_1 + h = j_1
e*x_2 + f*y_2 + g*x_2*y_2 + h = j_2
e*x_3 + f*y_3 + g*x_3*y_3 + h = j_3

Again, you can use linear algebra:

[x_0 y_0 x_0*y_0 1]   [a e]   [i_0 j_0]
[x_1 y_1 x_1*y_1 1] * [b f] = [i_1 j_1]
[x_2 y_2 x_2*y_2 1]   [c g]   [i_2 j_2]
[x_3 y_3 x_3*y_3 1]   [d h]   [i_3 j_3]

Plug in your corners for x_n,y_n,i_0,j_0. Take the inverse of the 4x4 matrix and multiply it by the right side of the equation. The transpose of that matrix is P. You should be able to find functions to compute a matrix inverse and multiply online.

Where you'll probably have bugs:

  • When computing, remember to check for division by zero. That's a sign that your matrix is not invertible. That might happen if you try to map one (x,y) co-ordinate to two different points.
  • If you write your own matrix math, remember that matrices are usually specified row,column (vertical,horizontal) and screen graphics are x,y (horizontal,vertical). You're bound to get something wrong the first time.
Eyal