ansaurus

Question

Vertex shader world transform, why do we use 4 dimensional vectors?

Answer 1

+4 A:

Rotation is specified by a 3 x 3 matrix and translation by a 3-component vector. You can perform both transforms in a "single" operation by combining them into a single 3 x 4 matrix (3 rows, 4 columns):

rx1 rx2 rx3 tx1
ry1 ry2 ry3 ty1
rz1 rz2 rz3 tz1

However as this isn't square there are various operations that can't be performed (inversion for one). By adding an extra row (that does nothing):

0   0   0   1

all these operations become possible (if not easy).
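
As a rough illustration of the idea above, here is a minimal Python sketch that packs a rotation and a translation into one square 4 x 4 matrix and applies it to a point. The helper names (make_affine, transform) and the sample values are made up for this example, and it uses the column-vector layout shown above (translation in the last column).

```python
def make_affine(rotation, translation):
    """Pack a 3x3 rotation and a 3-component translation into one 4x4 matrix."""
    rows = [list(rotation[i]) + [translation[i]] for i in range(3)]
    rows.append([0, 0, 0, 1])  # the extra row that makes the matrix square
    return rows


def transform(matrix, vector):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Illustrative values: a 90-degree rotation about the z axis (exact integers)
rotation_z90 = [[0, -1, 0],
                [1,  0, 0],
                [0,  0, 1]]
translation = [1, 2, 3]

affine = make_affine(rotation_z90, translation)
point = [1, 0, 0, 1]  # a position, with w = 1 so the translation is applied

moved_point = transform(affine, point)
print(moved_point)
```

The point is rotated onto the y axis and then shifted by the translation, all in one multiplication.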

ChrisF 2009-10-22 08:21:58

Ah, a much better answer than mine :) +1 (Here's a page I found with some more info http://planning.cs.uiuc.edu/node104.html)

Mark Pim 2009-10-22 08:24:32

Does this have anything to do with why it's a 4-vector output from a vertex shader? It explains why 4-vectors are used, but you make no mention of the perspective divide ...

Goz 2009-10-22 08:26:04

Answer 2

+2 A:

Because you need the w coordinate for the perspective calculation. After you output from the vertex shader, DirectX then performs a perspective divide by dividing by w.

Essentially, if you have (32768, -32768, 32768, 65536) as your output vertex position then after the w divide you get (0.5, -0.5, 0.5, 1). At this point the w can be discarded as it is no longer needed. This information is then passed through the viewport matrix, which transforms it to usable 2D coordinates.
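
The divide and the viewport step can be sketched in a few lines of Python. This is only an illustration: the function names and the 800 x 600 viewport are assumptions made for the example, and the viewport mapping assumes the usual Direct3D layout with the origin at the top-left and y pointing down.

```python
def perspective_divide(clip):
    """Divide x, y and z by w; w itself is no longer needed afterwards."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def to_screen(ndc, width, height):
    """Map x, y from the -1..1 range to pixel coordinates (y flipped)."""
    x, y, _ = ndc
    return ((x + 1) * 0.5 * width, (1 - y) * 0.5 * height)


ndc = perspective_divide((32768, -32768, 32768, 65536))
screen = to_screen(ndc, 800, 600)  # hypothetical 800 x 600 viewport
print(ndc, screen)
```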

Edit: If you look at how a matrix multiplication is performed using the projection matrix you can see how the values get placed in the correct places.

Taking the projection matrix specified in D3DXMatrixPerspectiveLH

2*zn/w  0       0              0
0       2*zn/h  0              0
0       0       zf/(zf-zn)     1
0       0       zn*zf/(zn-zf)  0

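And applying it to an arbitrary (x, y, z, 1) vertex input value you get the following. (Note that for a vertex position the input w will always be 1. The w and h inside the matrix entries 2*zn/w and 2*zn/h are the view volume width and height, not the vector component.)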

x' = ((2*zn/w) * x) + (0 * y) + (0 * z) + (0 * w)
y' = (0 * x) + ((2*zn/h) * y) + (0 * z) + (0 * w)
z' = (0 * x) + (0 * y) + ((zf/(zf-zn)) * z) + ((zn*zf/(zn-zf)) * w)
w' = (0 * x) + (0 * y) + (1 * z) + (0 * w)

Instantly you can see that w' and z' are different. The w' coordinate now just contains the z coordinate passed to the projection matrix, while z' contains something far more complicated.

So, assume we have an input position of (2, 1, 5, 1), a zn (Z-Near) of 1, a zf (Z-Far) of 10, a w (width) of 1 and an h (height) of 1.

Passing these values through we get

x' = ((2 * 1)/1) * 2
y' = ((2 * 1)/1) * 1
z' = ((10/(10-1)) * 5) + (((10 * 1)/(1-10)) * 1)
w' = 5

Evaluating that, we then get (z' rounded to two decimal places)

x' = 4
y' = 2
z' = 4.44
w' = 5

We then perform the final perspective divide and we get

x'' = 0.8
y'' = 0.4
z'' = 0.89
w'' = 1

And now we have our final coordinate position. This assumes that x and y range from -1 to 1 and z ranges from 0 to 1. As you can see, the vertex is on-screen.
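
The whole worked example can be checked with a short Python sketch. The function names are invented for this illustration; the matrix layout follows the D3DXMatrixPerspectiveLH form quoted above, and the vertex is treated as a row vector multiplied on the left of the matrix.

```python
def perspective_lh(w, h, zn, zf):
    """Projection matrix in the D3DXMatrixPerspectiveLH layout (row vectors)."""
    return [
        [2 * zn / w, 0, 0, 0],
        [0, 2 * zn / h, 0, 0],
        [0, 0, zf / (zf - zn), 1],
        [0, 0, zn * zf / (zn - zf), 0],
    ]


def row_vector_times_matrix(v, m):
    """Compute v * m for a 4-component row vector and a 4x4 matrix."""
    return [sum(v[k] * m[k][col] for k in range(4)) for col in range(4)]


projection = perspective_lh(w=1, h=1, zn=1, zf=10)
clip = row_vector_times_matrix([2, 1, 5, 1], projection)  # x', y', z', w'
ndc = [c / clip[3] for c in clip]                         # after the w divide

print(clip)
print(ndc)
```

Here clip holds the values before the divide and ndc the values after it; z' comes out as 40/9 and z'' as 8/9.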

As a bizarre bonus you can see that if |x'| or |y'| is larger than w', or z' is larger than w' or less than 0, then the vertex is off-screen. This info is used for clipping the triangle to the screen.
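
That test is easy to write down before the divide ever happens. The following Python sketch is only illustrative (the function name is made up) and assumes the Direct3D convention used in this answer, where z runs from 0 to w rather than from -w to w.

```python
def is_inside_clip_volume(clip):
    """Return True if a clip-space vertex lies inside the view volume.

    Assumes the Direct3D convention: -w <= x <= w, -w <= y <= w, 0 <= z <= w.
    """
    x, y, z, w = clip
    return -w <= x <= w and -w <= y <= w and 0 <= z <= w


on_screen = is_inside_clip_volume((4, 2, 4.44, 5))     # the example vertex
too_far_right = is_inside_clip_volume((6, 2, 4.44, 5))  # x' exceeds w'
behind_near = is_inside_clip_volume((4, 2, -0.5, 5))    # z' below 0
print(on_screen, too_far_right, behind_near)
```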

Anyway, I think that's a pretty comprehensive answer :D

Edit2: Be warned, I am using ROW major matrices. Column major matrices are transposed.

Goz 2009-10-22 08:25:09

Answer 3

+1 A:

Clipping is an important part of this process, as it helps to visualize what happens to the geometry. The clipping stage essentially discards any point in a primitive that is outside of a 2-unit cube centered around the origin (OK, you have to reconstruct primitives that are partially clipped but that doesn't matter here).

It would be possible to construct a matrix that directly mapped your world space coordinates to such a cube, but gradual movement from the far plane to the near plane would be linear. That is to say that a move of one foot (towards the viewer) when one mile away from the viewer would cause the same increase in size as a move of one foot when several feet from the camera.

However, if we have another coordinate in our vector (w), we can divide the vector component-wise by w, and our primitives won't exhibit the above behavior, but we can still make them end up inside the 2-unit cube above.
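
To put rough numbers on this, here is a simplified Python sketch. It assumes w ends up equal to the distance from the viewer, so that on-screen size is proportional to size divided by distance; the function name and the distances (5280 feet for the mile, 5 feet for "several feet") are picked purely for illustration.

```python
def projected_size(size, distance):
    """On-screen size after the divide by w, taking w to be the distance."""
    return size / distance


FEET_PER_MILE = 5280

# Move one foot towards the viewer from one mile away
growth_far = projected_size(1.0, FEET_PER_MILE - 1) - projected_size(1.0, FEET_PER_MILE)

# Move one foot towards the viewer from five feet away
growth_near = projected_size(1.0, 4) - projected_size(1.0, 5)

print(growth_far, growth_near)
```

With the divide, the same one-foot step barely changes the far object's size but changes the near object's size a great deal, which is the behavior a purely linear mapping cannot give.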

For further explanations see http://www.opengl.org/resources/faq/technical/depthbuffer.htm#0060 and http://en.wikipedia.org/wiki/Transformation%5Fmatrix#Perspective%5Fprojection.

A simple answer would be to say that if you don't tell the pipeline what w is then you haven't given it enough information about your projection. This can be verified directly without understanding what the pipeline does with it...

As you probably know the 4x4 matrix can be split into parts based on what each part does. The 3x3 matrix at the top left is altered when you do rotation or scale operations. The fourth column is altered when you do a translation. If you ever inspect a perspective matrix, it alters the bottom row of the matrix. If you then look at how a Matrix-Vector multiplication is done, you see that the bottom row of the matrix ONLY affects the resultant w component of the vector. So if you don't tell the pipeline about w it won't have all your information.
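
The claim about the bottom row can be checked with a small Python sketch. It uses the column-vector convention described in the paragraph above (translation in the fourth column); the names and the sample bottom row are made up for this example.

```python
def matrix_times_column_vector(m, v):
    """Compute m * v for a 4x4 matrix and a 4-component column vector."""
    return [sum(m[row][k] * v[k] for k in range(4)) for row in range(4)]


identity = [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

# Same matrix, but with a perspective-style bottom row that copies z into w
perspective_like = [row[:] for row in identity]
perspective_like[3] = [0, 0, 1, 0]

v = [2, 1, 5, 1]
plain = matrix_times_column_vector(identity, v)
altered = matrix_times_column_vector(perspective_like, v)
print(plain, altered)
```

Only the last component differs between the two results, so the information carried by the bottom row reaches the pipeline through w alone.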

DaedalusFall 2009-10-22 09:22:00
