A: 

It is impossible to know the width of this rectangle without knowing the distance of the 'camera'.

A small rectangle viewed from 5 centimeters away looks the same as a huge rectangle seen from meters away.

Toad
Partially correct. Not only do you need to know the distance, you need to know the field of view of the camera as well, e.g. a typical 35mm camera has a view angle of 54 degrees with no zoom.
Neil N
one would probably also need to know the rotation, since it's unclear which side is up
Toad
I do not need the width, just the proportions, i.e. the quotient (width / height). The scale is of course dependent on the distance to the observer, but as far as I can tell, the proportions are not. A 1-by-1 square will map to different projections than a 1-by-2 rectangle, correct?
HugoRune
A: 

You need more information; that transformed figure could come from any parallelogram given an arbitrary perspective.

So I guess you need to do some kind of calibration first.

Edit: for those who said that I was wrong, here goes the mathematical proof that there are infinitely many combinations of rectangles/cameras that yield the same projection:

In order to simplify the problem (as we only need the ratio of the sides) let's assume that our rectangle is defined by the following points: R=[(0,0),(1,0),(1,r),(0,r)] (this simplification is the same as transforming any problem to an equivalent one in an affine space).

The transformed polygon is defined as: T=[(tx0,ty0),(tx1,ty1),(tx2,ty2),(tx3,ty3)]

There exists a transformation matrix M = [[m00,m01,m02],[m10,m11,m12],[m20,m21,m22]] that satisfies M*(Rxi,Ryi,1)' = wi*(txi,tyi,1)'

If we expand the equation above for the points,

for R_0 = (0,0) we get: m02-tx0*w0 = m12-ty0*w0 = m22-w0 = 0

for R_1 = (1,0) we get: m00+m02-tx1*w1 = m10+m12-ty1*w1 = m20+m22-w1 = 0

for R_2 = (1,r) we get: m00+r*m01+m02-tx2*w2 = m10+r*m11+m12-ty2*w2 = m20+r*m21+m22-w2 = 0

and for R_3 = (0,r) we get: r*m01+m02-tx3*w3 = r*m11+m12-ty3*w3 = r*m21+m22-w3 = 0

So far we have 12 equations, 14 unknown variables (9 from the matrix, 4 from the wi, and 1 for the ratio r) and the rest are known values (txi and tyi are given).

Even if the system weren't underspecified, some of the unknowns are multiplied among themselves (the r and mi1 products), making the system non-linear (you could transform it to a linear system by assigning a new name to each product, but you'll still end up with 13 unknowns, 3 of which each expand to infinitely many solutions).
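The counting argument above can be checked numerically. The following is only a sketch, assuming NumPy and the column-vector convention M*(x,y,1)' = w*(tx,ty,1)'; the function names and the quadrilateral T are made up for illustration. It shows that an unconstrained 3x3 matrix exists for every r. It does not check whether such a matrix corresponds to a physical camera, which is the point disputed in the comments.

```python
import numpy as np

def homography(src, dst):
    """Solve for M (up to scale) with M @ (x, y, 1)' = w * (tx, ty, 1)'."""
    rows = []
    for (x, y), (tx, ty) in zip(src, dst):
        # Each point pair gives two linear equations once w is eliminated.
        rows.append([x, y, 1, 0, 0, 0, -tx * x, -tx * y, -tx])
        rows.append([0, 0, 0, x, y, 1, -ty * x, -ty * y, -ty])
    # The null vector of the 8x9 system is the last right singular vector.
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def project(M, pts):
    """Apply M to 2D points and divide by the homogeneous coordinate w."""
    p = np.hstack([np.array(pts, dtype=float), np.ones((len(pts), 1))]) @ M.T
    return p[:, :2] / p[:, 2:3]

def rectangle(r):
    """The rectangle R from the answer, with side ratio r."""
    return [(0, 0), (1, 0), (1, r), (0, r)]

# Hypothetical projected quadrilateral (tx_i, ty_i), chosen arbitrarily.
T = [(10, 20), (110, 35), (95, 140), (5, 100)]

for r in (1.0, 2.0, 3.0):
    M = homography(rectangle(r), T)
    print(r, np.allclose(project(M, rectangle(r)), T))
```

For each r tried, a matrix is found that maps the rectangle onto the same T, because a general 3x3 matrix has enough freedom to absorb any scaling of one axis.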

If you can find any flaw in the reasoning or the maths, please let me know.

fortran
But he knows it's a rectangle, e.g. scanned documents.
Neil N
@Neil N So what? Maybe now the rectangles are not parallelograms and I haven't noticed...
fortran
because rectangles have all 90 degree corners, which takes the possible rotations down from infinity to one (well technically two if you consider he could be looking at the back side). A huge difference.
Neil N
but there's still an infinite number of different rectangles that can look the same if the correct perspective is applied.
fortran
that's what I was wondering. As far as I can tell, a rectangle with (width=2*height) has a different set of possible projections than a rectangle with (width=3*height). So looking at a given perspective projection, there will be an infinite number of possible rectangles, but they will all have the same ratio of width to height.
HugoRune
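One way to test this conjecture is a synthetic experiment. The sketch below is not taken from any answer in this thread; it assumes NumPy, an ideal pinhole camera with square pixels, image coordinates measured from the principal point (taken here to be the image center), and a known corner order. All function names and camera parameters are hypothetical. Under those assumptions, the perpendicularity of the rectangle's sides fixes the focal length, and the side ratio follows.

```python
import numpy as np

def homography(src, dst):
    """Solve for H (up to scale) with H @ (x, y, 1)' = w * (u, v, 1)'."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def aspect_ratio(quad):
    """Estimate width/height of the rectangle whose image is `quad`.

    Assumes square pixels, principal point at image coordinate (0, 0),
    and corners given in the order (0,0), (w,0), (w,h), (0,h).
    """
    H = homography([(0, 0), (1, 0), (1, 1), (0, 1)], quad)
    h1, h2 = H[:, 0], H[:, 1]
    # The two sides are perpendicular in 3D; this determines f^2.
    # The division breaks down when a pair of sides is parallel in the image.
    f2 = -(h1[0] * h2[0] + h1[1] * h2[1]) / (h1[2] * h2[2])
    # With f known, the lengths of the back-projected columns give w and h
    # up to a common scale.
    n1 = (h1[0] ** 2 + h1[1] ** 2) / f2 + h1[2] ** 2
    n2 = (h2[0] ** 2 + h2[1] ** 2) / f2 + h2[2] ** 2
    return np.sqrt(n1 / n2)

def rotation(a, b):
    """Rotation by angle a about the x axis composed with b about the y axis."""
    ca, sa, cb, sb = np.cos(a), np.sin(a), np.cos(b), np.sin(b)
    rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    return rx @ ry

def photograph(w, h, f=800.0, a=0.4, b=-0.3, t=(-1.0, -0.5, 6.0)):
    """Project a w-by-h rectangle with a made-up camera pose and focal length."""
    corners = np.array([(0, 0, 0), (w, 0, 0), (w, h, 0), (0, h, 0)], dtype=float)
    cam = corners @ rotation(a, b).T + np.array(t)
    return f * cam[:, :2] / cam[:, 2:3]

print(aspect_ratio(photograph(2.0, 1.0)))
print(aspect_ratio(photograph(3.0, 1.0)))
```

With these made-up camera parameters the two printed values should come out close to 2 and 3. If the principal point or the pixel aspect ratio is unknown, this reasoning no longer applies as written.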
I think I'll have to find a counter-example
fortran
I can't follow your reasoning. The transformation in question is NOT just a rotation, so what is the justification for your "there exists" line? (And why not just give us a counter-example?)
Beta
If it were just a rotation I would have used a 2x2 matrix and cartesian coordinates, not a 3x3 and homogeneous coordinates. It's expected to know a little bit about perspective projections to follow the reasoning, hence the "there exists" line. A demonstration is better than a counter example, but if you want one, you can fill the values that are supposed given in the equations, solve them (doing the trick to assign new variables to the non-linear products to make them linear) and then you'll have all the counterexamples you can ever wish for.
fortran
I admit I had never heard of homogeneous coordinates before, but now that I've read up a little on them it still seems that you are assuming what you're trying to prove. I will eat my words if you give a counter-example. Shall I construct the 2:1 rectangle and place the camera?
Beta
+1  A: 

Size isn't really needed, and neither are proportions. And knowing which side is up is kind of irrelevant considering he's using photos/scans of documents. I doubt he's going to scan the back sides of them.

"Corner intersection" is the method to correct perspective. This might be of help:

http://stackoverflow.com/questions/530396/how-to-draw-a-perspective-correct-grid-in-2d

Neil N
Thanks, but I am not sure if I understand this fully: Using the information given in the linked answer, I can map the quadrangle in the picture to an arbitrary rectangle, by subdividing at the intersection of the diagonals. What I would like to do is map the quadrangle to a rectangle with the correct proportions. So a picture of a square should be mapped only to a square. I am not sure how to get the ratio of sides. Googling for "corner intersection" did not work.
HugoRune
If you continue to intersect down until the rectangles are smaller than pixels, from there you can measure the height and width. Then you would know how big to create your destination rectangle, and you just map backwards from there.
Neil N
I am not sure how that would work. When I intersect the original quadrangle n times, i will get 2^n * 2^n smaller quadrangles. Even if they are smaller than pixels, they still have the exact same proportions as the original quadrangle, and the original quadrangle will be exactly 2^n small_quadrangles high and 2^n small quadrangles wide. If I map each small quadrangle to a pixel, I will end up with a square.
HugoRune
If both height and width intersection became smaller than pixel height/width on the same iteration, then yes you would have a square. If Height took twice as many iterations as width, you have a 2:1 H:W ratio... get it?
Neil N
Sorry for being dense, but I do not get it at all. Using the examples shown here: http://freespace.virgin.net/hugo.elias/graphics/x_persp.htm If I intersect the quadrangle ABCD into smaller and smaller similar sub-quadrangles, I will eventually get sub-quadrangles smaller than a pixel. But on which iteration that happens depends: close to the CD side, the sub-quadrangles will be smaller than the ones close to the AB side of the original quadrangle. So the value I get seems arbitrary, and I do not understand how this is related to the ratio of the undistorted rectangle.
HugoRune
What I'm saying is the subdivided quad may have its HEIGHT smaller than a pixel, without its WIDTH being there yet. At which point you keep going until WIDTH is also smaller than a pixel. So you would have the number of iterations it took to get height (let's say 10) and however many iterations got you to width (let's say 12), which means you have a 10:12 height to width ratio.
Neil N
A: 

Draw a right isosceles triangle with those two vanishing points and a third point below the horizon (that is, on the same side of the horizon as the rectangle is). That third point will be our origin and the two lines to the vanishing points will be our axes. Call the distance from the origin to a vanishing point pi/2. Now extend the sides of the rectangle from the vanishing points to the axes, and mark where they intersect the axes. Pick an axis, measure the distances from the two marks to the origin, transform those distances: x->tan(x), and the difference will be the "true" length of that side. Do the same for the other axis. Take the ratio of those two lengths and you're done.

Beta
I think I get it! Something like this: http://img39.imageshack.us/img39/4273/perspectivediagramisoskh.jpg I have to think about this a bit more, but at first glance I think that is exactly what I needed, thanks a lot! (by the way, I see that you simplified your answer a bit, but I found the original comments about the origin being the point below the camera, and assuming the camera to be at a distance of 1 very useful too)
HugoRune
I am trying to wrap my head around this method. Is it possible to extend it for the degenerate case, when one of the vanishing points is close to infinity, i.e. when two sides of the quadrangle are parallel or almost parallel?
HugoRune
Yes, that image captures it. This method is actually just approximate, and doesn't work well in some extreme cases. In the exact solution, the lines to the vanishing point aren't lines, they're curves (that's right, 2-point perspective is bunk), and the math is a little harder; I'll post some graphics if I can figure out how. If the figure is almost a rectangle, it's face-on and you can just do x->tan(x). If it's almost a parallelogram with non-right-angles, it's very small and you're sunk.
Beta
+1  A: 
Eric Bainville
Excuse me, but this doesn't look right. You appear to have moved the camera between these two cases, which will change the appearance of ABCD. Projecting onto a plane like this is only approximately correct at best, and you've broken the rules.
Beta
Yes, the eye is at the intersection of the red lines. You're right that the position of the camera changes between the two views. What does not change is the input of the problem: the projected ABCD.
Eric Bainville
Excuse me, but you're wrong. You're projecting onto the wrong plane. If I construct a 2:1 rectangle, give it position and orientation, and place the camera, do you think you can find a 3:1 rectangle that looks the same to the camera?
Beta
In the question as I understood it, we only have the projected rectangle as input (ABCD on the gray plane). We don't know anything about the projection, so we can assume it is defined by a point and a plane. Then the question can be restated as: do all the rectangles of the 3D space projecting into ABCD have the same w/h ratio?
Eric Bainville
Do you accept my challenge?
Beta
Without moving the camera, I don't think we can project a 2:1 and a 3:1 rectangle to the same ABCD in the general case. But as I said in a previous comment, this is not the original problem, where we don't know where the camera is.
Eric Bainville
+4  A: 
HugoRune
Thanks for the code and the nice problem!
Eric Bainville