views:

856

answers:

6

Hey math geeks, I've got a problem that's been stumping me for a while now. It's for a personal project.

I've got three dots: red, green, and blue. They're positioned on a cardboard slip such that the red dot is in the lower left (0,0), the blue dot is in the lower right (1,0), and the green dot is in the upper left. Imagine stepping back and taking a picture of the card from an angle. If you were to find the center of each dot in the picture (let's say the units are pixels), how would you find the normal vector of the card's face in the picture (relative to the camera)?

Now a few things I've picked up about this problem:

  1. The dots (in "real life") are always at a right angle. In the picture, they're only at a right angle if the camera has been rotated around the red dot along an "axis" (axis being the line created by the red and blue or red and green dots).
  2. There are dots on only one side of the card. Thus, you know you'll never be looking at the back of it.
  3. The distance of the card to the camera is irrelevant. If I knew the depth of each point, this would be a whole lot easier (just a simple cross product, no?).
  4. The rotation of the card is irrelevant to what I'm looking for. In the tinkering that I've been doing to try to figure this one out, the rotation can be found with the help of the normal vector in the end. Whether or not the rotation is a part of (or product of) finding the normal vector is unknown to me.

Hope there's someone out there that's either done this or is a math genius. I've got two of my friends here helping me on it and we've--so far--been unsuccessful.

+1  A: 

From the sounds of it, you have three points p1, p2, and p3 defining a plane, and you want to find the normal vector to the plane.

Representing the points as vectors from the origin, an equation for a normal vector would be
n = (p2 - p1)x(p3 - p1)
(where x is the cross-product of the two vectors)

If you want the vector to point outwards from the front of the card, then ala the right-hand rule, set
p1 = red (lower-left) dot
p2 = blue (lower-right) dot
p3 = green (upper-left) dot

BlueRaja - Danny Pflughoeft
The problem is that the points that I have are not three-dimensional. They are two-dimensional. I need to use the 2d points to determine the normal vector of the original plane. So for instance, if I were to take a picture of the cardboard and normalize the coordinates of the cardboard in the picture such that the red dot is at (0,0), i.e. I crop the picture---and knowing that the dots represent three dots "in real life" which form a right isosceles triangle---how could I determine what the normal vector is [of the cardboard "in real life"].
mattbasta
All n-dimensional points are also n+m-dimensional points for each integer m greater than or equal to one; simply set the additional components to 0.
Ignacio Vazquez-Abrams
There is some affine mapping ( http://en.wikipedia.org/wiki/Affine_transformation ) which maps the points on your cardboard-plane to the points on your viewing-plane. Assuming you know the width and height of the cardboard, you can create a set of linear equations to find it. Each equation would look like **m** * (x,y) + **b** = **p**, with **m** a linear transformation (3x2 matrix), (x,y) the coordinates of the point on the original cardboard-plane, **b** the offset (a 3x1 matrix), and **p** one of the points. This gives 9 equations and 9 unknowns, which is solvable.
BlueRaja - Danny Pflughoeft
@Ignacio: If you do that the computed "normal" is always along the z-axis, which is wrong.
dmckee
A: 

Hi mattbasta,

right, the normal vector does not change by distance, but the projection of the cardboard on a picture does change by distance (Simple: If you have a small cardboard, nothing changes. If you have a cardboard 1 mile wide and 1 mile high and you rotate it so that one side is nearer and the other side more far away, the near side is magnified and the far side shortened on the picture. You can see that immediately that an rectangle does not remain a rectangle, but a trapeze)

The mostly accurate way for small angles and the camera centered on the middle is to measure the ratio of the width/height between "normal" image and angle image on the middle lines (because they are not warped).

We define x as left to right, y as down to up, z as from far to near.

Then
x = arcsin(measuredWidth/normWidth) red-blue
y = arcsin(measuredHeight/normHeight) red-green
z = sqrt(1.0-x^2-y^2)

I will calculate tomorrow a more exact solution, but I'm too tired now...

Thorsten S.
It seem to me that this will only work if you know the range to the card (or if you are using a parallel projection (as well as the small angles limitation) because you have assumed that you know the expected height and width.
dmckee
@Thorsten: I looked over your work and it seems straightforward. I'm testing out a prototype of some Java code that give me sets of points like these. I'll plug your formula in and let you know how it turns out.@dmckee: For the small scale of this project, it can be assumed that parallel projection is used. The error that would be caused would be insignificant for the purposes of this experiment.
mattbasta
+1  A: 

Just thinking on my feet here.

Your effective inputs are the apparent ratio RB/RG [+], the apparent angle BRG, and the angle that (say) RB makes with your screen coordinate y-axis (did I miss anything). You need out the components of the normalized normal (heh!) vector, which I believe is only two independent values (though you are left with a front-back ambiguity if the card is see through).[++]

So I'm guessing that this is possible...

From here on I work on the assumption that the apparent angle of RB is always 0, and we can rotate the final solution around the z-axis later.

Start with the card positioned parallel to the viewing plane and oriented in the "natural" way (i.e. you upper vs. lower and left vs. right assignments are respected). We can reach all the interesting positions of the card by rotating by \theta around the initial x-axis (for -\pi/2 < \theta < \pi/2), then rotating by \phi around initial y-axis (for -\pi/2 < \phi < \pi/2). Note that we have preserved the apparent direction of the RB vector.

Next step compute the apparent ratio and apparent angle after in terms of \theta and \phi and invert the result.[+++]

The normal will be R_y(\phi)R_x(\theta)(0, 0, 1) for R_i the primitive rotation matrix around axis i.

[+] The absolute lengths don't count, because that just tells you the distance to card.

[++] One more assumption: that the distance from the card to view plane is much large than the size of the card.

[+++] Here the projection you use from three-d space to the viewing plane matters. This is the hard part, but not something we can do for you unless you say what projection you are using. If you are using a real camera, then this is a perspective projection and is covered in essentially any book on 3D graphics.

dmckee
+8  A: 

i worked it out in my old version of MathCAD:

alt text

Edit: Wording wrong in screenshot of MathCAD: "Known: g and b are perpendicular to each other"

In MathCAD i forgot the final step of doing the cross-product, which i'll copy-paste here from my earlier answer:

Now we've solved for the X-Y-Z of the translated g and b points, your original question wanted the normal of the plane.

If cross g x b, we'll get the vector normal to both:

        | u1  u2  u3 |
g x b = | g1  g2  g3 |
        | b1  b2  b3 |  

      = (g2b3 - b2g3)u1 + (b1g3 - b3g1)u2 + (g1b2 - b1g2)u3

All the values are known, plug them in (i won't write out the version with g3 and b3 substituted in, since it's just too long and ugly to be helpful.

But in practical terms, i think you'll have to solve it numerically, adjusting gz and bz so as to best fit the conditions:

g · b = 0

and

|g| = |b|

Since the pixels are not algebraically perfect.

Example

Using a picture of the Apollo 13 astronauts rigging one of the command module's square Lithium Hydroxide cannister to work in the LEM, i located the corners:

alt text

Using them as my basis for an X-Y plane:

alt text

i recorded the pixel locations using Photoshop, with positive X to the right, and positive Y down (to keep the right-hand rule of Z going "into" the picture):

g = (79.5, -48.5, gz)

b = (-110.8, -62.8, bz)

Punching the two starting formulas into Excel, and using the analysis toolpack to "minimize" the error by adjusting gz and bz, it came up with two Z values:

g = (79.5, -48.5, 102.5)

b = (-110.8, -62.8, 56.2)

Which then lets me calcuate other interesting values.

The length of g and b in pixels:

|g| = 138.5

|b| = 139.2

The normal vector:

g x b = (3710, -15827, -10366)

The unit normal (length 1):

uN = (0.1925, -0.8209, -0.5377)

Scaling normal to same length (in pixels) as g and b (138.9):

Normal = (26.7, -114.0, -74.7)

Now that i have the normal that is the same length as g and b, i plotted them on the same picture:

alt text

i think you're going to have a new problem: distortion introduced by the camera lens. The three dots are not perfectly projected onto the 2-dimensional photographic plane. There's a spherical distortion that makes straight lines no longer straight, makes equal lengths no longer equal, and makes the normals slightly off of normal.

Microsoft research has an algorithm to figure out how to correct for the camera's distortion:

A Flexible New Technique for Camera Calibration

But it's beyond me:

We propose a flexible new technique to easily calibrate a camera. It is well suited for use without specialized knowledge of 3D geometry or computer vision. The technique only requires the camera to observe a planar pattern shown at a few (at least two) different orientations. Either the camera or the planar pattern can be freely moved. The motion need not be known. Radial lens distortion is modeled. The proposed procedure consists of a closed-form solution, followed by a nonlinear refinement based on the maximum likelihood criterion. Both computer simulation and real data have been used to test the proposed technique, and very good results have been obtained. Compared with classical techniques which use expensive equipments such as two or three orthogonal planes, the proposed technique is easy to use and flexible. It advances 3D computer vision one step from laboratory environments to real world use.

They have a sample image, where you can see the distortion:

alt text

Note

  • you don't know if you're seeing the "top" of the cardboard, or the "bottom", so the normal could be mirrored vertically (i.e. z = -z)

Update

Guy found an error in the derived algebraic formulas. Fixing it leads to formulas that i, don't think, have a simple closed form. This isn't too bad, since it can't be solved exactly anyway; but numerically.

Here's a screenshot from Excel where i start with the two knowns rules:

g · b = 0

and

|g| = |b|

Writing the 2nd one as a difference (an "error" amount), you can then add both up and use that value as a number to have excel's solver minimize:

alt text

This means you'll have to write your own numeric iterative solver. i'm staring over at my Numerical Methods for Engineers textbook from university; i know it contains algorithms to solve recursive equations with no simple closed form.

Ian Boyd
I like this answer a lot. As for the top or bottom, I'm going to say that it's safe to assume that it's always pointing "up", but regardless, that's easy to determine with some simple math down the line (angle of the blue dot to the green dot around the red dot vs angle of either the green or blue dot).Thanks a lot!
mattbasta
The problem with this is that we don't know how *"unit length"* translates from our old coordinate system to the new one - that is why `sqrt( 1 - g1^2 - g2^2 )` looks like it will not be real very often. (for instance, say our dots are 1-meter apart, and we take a picture of it at some weird angle - how many pixels correlate to one meter?)
BlueRaja - Danny Pflughoeft
Good point. i've graphically solved it, again, now not knowing the length of the two lines. i'll work out the math today and post it this evening. Short version: Translate red to origin. |G| = |B|, G.B=0
Ian Boyd
+1 for very simple and correct answer. A few things to note: `1.` This answer (correctly) states that there is more than one possible orientation for the board, given only those three points. In fact, no matter how many points you are given, there will be more than one answer (with only one picture). `2.` We still don't know how far away the board is from the camera, since bz and gz are both translated by rz, which is unknown. `3.` This only works if g.b = 0 (that is, if angle grb is a right-angle). A site note @Ian: please delete the *Old Answer*, it is not really necessary.
BlueRaja - Danny Pflughoeft
@BlueRaja, Matt originally only wants a vector which is normal to the plane. Because of this, the "depth" of r,g,b doesn't matter, since a simple translation will not change the orientation of the plane, or its normal. Also, i thought i should keep the older answer as it contains some useful thought processes, and since you pointed out the "unit length" problem, the original answer servers to show the evolution of the final answer.
Ian Boyd
Also, if we did know how long a "unit" was, in pixels, then the other answer can be applied. So perhaps someone can find it useful.
Ian Boyd
@Ian: I realize the depth doesn't matter, that's why I gave you +1 for a correct answer :) I was only emphasizing that point. However, as for the *Old Answer* : since the "length" of any group of pixels depends on the depth of the physical object, any object whose depth changes (ie. is at an angle *wrt* the camera) will not have a consistent units/pixel, so the old equation is still rather unhelpful.
BlueRaja - Danny Pflughoeft
Otherwise, I think this is a fantastic answer (especially the pictures), thank you!
BlueRaja - Danny Pflughoeft
Very awesome! Thanks a ton!
mattbasta
Just thought I'd add an example of why adding more points doesn't help the ambiguity: http://www.illusion-optical.com/Optical-Illusions/Cube.php In this picture is the normal to the top of the cube facing up and towards the camera, or up and away from the camera?
BlueRaja - Danny Pflughoeft
Last night I ran through the numbers with my friend Carrie and we managed to replicate your success. Hooray! We did come up with a little trick, however, and that's how to decide whether the g_z and b_z values are positive or negative. We simply make them positive if their *_y value is positive and vice versa. It's very fast and works perfectly.Thanks again!
mattbasta
@mattbasta: In a digital picture, all y-values are positive. The trick is to know where the 'horizon' is - the line that where the ceiling and floor (or sky/ground) would meet if they extended on forever. If you call that line y=0, then your trick works (try taking a picture of the same object above/below the camera's horizon and see)
BlueRaja - Danny Pflughoeft
@BlueRaja Not if you normalize the origin of the coordinate system to the center dot. Since the camera sees with perspective, the ray between any point in the picture and a point in real life can be used to define a horizontal plane that extends from the camera outward. If we define this as the XY plane, then anything above it has a positive Z value and vice versa.
mattbasta
There is no notion of "horizon". Take the picture i used as a sample. **a)** you don't know what part of the picture i took it from **b)** The camera could have been rotated to some arbitrary angle, and **c)** There's no horizon because **it's taken in space!** The only special point could be along the lens's axis; but that's only useful if you're going to correct for lens distortion. If not, any affine transformation of the source points is fine.
Ian Boyd
I believe that this is essential what I proposed, but working code trumps speculation every time. +1
dmckee
It's a shame the amount of rep you get for an answer is not proportional to the amount of work put into said answer. :)
Ian Boyd
+1: **What an effort.**
WhatIsOpenID
A: 

You could use u,v,n co-oridnates. Set your viewpoint to the position of the "eye" or "camera", then translate your x,y,z co-ordinates to u,v,n. From there you can determine the normals, as well as perspective and visible surfaces if you want (u',v',n'). Also, bear in mind that 2D = 3D with z=0. Finally, make sure you use homogenious co-ordinates.

Steve F.
A: 

@ Ian Boyd...I liked your explanation, only I got stuck on step 2, when you said to solve for bz. You still had bz in your answer, and I don't think you should have bz in your answer...

bz should be +/- square root of gx2 + gy2 + gz2 - bx2 - by2

After I did this myself, I found it very difficult to substitute bz into the first equation when you solved for gz, because when substituting bz, you would now get:

gz = -(gxbx + gyby) / sqrt( gx2 + gy2 + gz2 - bx2 - by2 )

The part that makes this difficult is that there is gz in the square root, so you have to separate it and combine the gz together, and solve for gz Which I did, only I don't think the way I solved it was correct, because when I wrote my program to calculate gz for me, I used your gx, and gy values to see if my answer matched up with yours, and it did not.

So I was wondering if you could help me out, because I really need to get this to work for one of my projects. Thanks!

Samir Silbak
You're right, i did have an error. i corrected it, and i also get the recursive problem of `gz` being expressed in terms of `gz`. i took the easy way and solved for `gz` and `bz` numerically (i.e. using numerical methods) in Excel. i created the two starting assumptions: 1) g and b are normal: `g*b=0` 2) g and b are the same length (`|g|=|b|`). i then had Excel minimize the error in meeting those conditions and it solved `g=(79.5,-48.5,103.3) b=(-110.8,-62.8,55.8)` Remarkably close to the original answer, considering the formulas was wrong. Is that the values you got?
Ian Boyd
No, I couldn't get the values, because I wrote mine in C++, so when I tried to solve for g_z, this was the formula: g_z = -(g_xb_x + g_yb_y) / sqrt( g_x^2 + g_y^2 + g_z^2 - b_x^2 - b_y^2 )...and you can see in that formula...I still need the value of g_z just to solve for g_z...and in general, you can never get a value of something that is unknown, when the unknown variable is withing the equation...so I was never able to get the right value of g_z.
Samir Silbak
but I could do what you did, and write a recursive function, to brute force these values in my program, so I might just do that, because solving for g_z is too difficult
Samir Silbak