3-D Silhouette-Finding in OpenGL

If we want to find silhouettes as we draw objects, rather than as a post-process on the z-buffer, then we should check all of the edges to see if either (a) one adjacent face is front-facing and the other is back-facing, or (b) the edge has only one adjacent face. To test if a face is front-facing, we need a vector from a point on that face to the camera and a normal vector. The face is front-facing iff the dot product of those two vectors is positive. The above algorithm sounds easy, but it's made more complicated by the need to transform between coordinate systems.
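For concreteness, here is a minimal sketch of the edge test, assuming a hypothetical Edge record that stores pointers to its (one or two) adjacent faces, and a frontFacing() predicate like the one we develop at the end of this note:

struct Face;				// holds the face's points and normal
bool frontFacing(const Face &f);	// developed in the sections below

struct Edge {
    Face *face[2];			// face[1] is NULL for a boundary edge
};

bool isSilhouette(const Edge &e)
{
    if (e.face[1] == NULL)
	return true;			// only one adjacent face
    // silhouette iff one adjacent face is front-facing and the other is not
    return frontFacing(*e.face[0]) != frontFacing(*e.face[1]);
}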

The formulas we'll develop here are also useful for doing our own lighting calculations, or for calculating the screen locations and sizes of things.

A Review of OpenGL Coordinate Systems

OpenGL's glViewport call controls the transform from view space to screen space. OpenGL's projection matrix is the transform from world space to view space. Sometimes this matrix is fixed for all time, but in some programs (e.g., if the camera is moving) it changes every frame. It is an affine matrix (last row (0,0,0,1)) for parallel projection, but not for perspective projection. OpenGL's modelview matrix is the transform from object space to world space ("modelview" is really a misnomer; typically this matrix has nothing to do with viewing). The modelview matrix typically changes from object to object, and also from frame to frame, if objects are moving. It is almost always affine.
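To summarize with a sketch (using the matrix names Mow, Mwv, and Mvs that we'll construct below), the full chain from object space to screen space is:

// object --Mow--> world --Mwv--> view --Mvs--> screen
// (modelview)      (projection)   (glViewport)
Vec3 ps = proj(Mvs*Mwv*Mow*Vec4(po,1));	// object point to screen point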

Pick a Common Coordinate System

We need to transform everything to a common coordinate system for these silhouette tests. The choice is arbitrary, but let's use world space. So we'll need to transform polygon points and normals from object space to world space, and transform the camera from screen space to world space. To compute these transformations, we could maintain our own copies of the transformation matrices in parallel with OpenGL, but that would be redundant and error-prone. Instead, we can use glGet to query OpenGL's state to read these matrices.

Transform the Camera Position

The camera is at (0,0,-infinity) in screen space, by definition. That's equivalent to the homogeneous point (0,0,-1,0) or (0,0,1,0). (In projective geometry, -infinity = +infinity.) If we let Msw denote the screen-to-world transformation matrix, then to find the camera position in world space, we take the position of the camera in screen space, (0,0,1,0), and transform it to world space. Using the SVL library:
Vec4 cw4 = Msw*Vec4(0,0,1,0);	// matrix times column vector
Vec3 cw = proj(cw4);		// homogeneous divide
or simply
Vec3 cw = proj(Msw*Vec4(0,0,1,0)); // camera position in world space
For corroboration of this formula, see Trace::trace(x,y) in a3/raytrace/trace.cc. Note that this formula won't work if Msw is affine (i.e., if the projection is parallel, not perspective). If we used it in that situation, we'd end up dividing by zero. I'll let you work out the correct formula for that case.

Recall that in SVL, if M is a matrix and v is a vector, M*v means matrix times column-vector, and v*M (which we'll use later) means row-vector times matrix. SVL regards the vector as either a row or column depending on context.
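For example (a hypothetical snippet just to illustrate the convention):

Mat4 M;
Vec4 v;
Vec4 a = M*v;	// v used as a column vector: a[i] = sum over j of M[i][j]*v[j]
Vec4 b = v*M;	// v used as a row vector:    b[j] = sum over i of v[i]*M[i][j]
// in general a != b; the two agree only when M is symmetric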

Recovering the Screen-to-World Transform

But where do we get Msw? In OpenGL, the world-to-view transform is the transpose of the projection matrix:
Mat4 Mwvt;
glGetDoublev(GL_PROJECTION_MATRIX, Mwvt.Ref());
// need the .Ref() because Mat4 is a high level 4x4 matrix class and
// glGetDoublev in this context wants a ptr to 16 consecutive doubles
Mat4 Mwv = trans(Mwvt);		// world-to-view transform
Recall that OpenGL stores all of its matrices transposed relative to standard form (an unfortunate anachronism).

We can compute the view-to-screen transform easily ourselves:

Mat4 scale, tran;
scale.MakeHScale(Vec3(width/2., height/2., 1.));
tran.MakeHTrans(Vec3(width/2., height/2., 0.));
Mat4 Mvs = tran*scale;		// view-to-screen transform
width and height are the width and height of the window, in pixels. (In the subsurf code from Assignment 2 they can be found in the View class.) If our own code calls glViewport, our view-to-screen transform will be consistent with the above. If not, use glGetIntegerv(GL_VIEWPORT, ...) to recover the viewport parameters, as sketched below.
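Here is a sketch of that fallback; glGetIntegerv returns the viewport as (x, y, width, height), and the x and y offsets belong in the translation:

GLint vp[4];
glGetIntegerv(GL_VIEWPORT, vp);	// vp = {x, y, width, height}
Mat4 scale, tran;
scale.MakeHScale(Vec3(vp[2]/2., vp[3]/2., 1.));
tran.MakeHTrans(Vec3(vp[0]+vp[2]/2., vp[1]+vp[3]/2., 0.));
Mat4 Mvs = tran*scale;		// view-to-screen transform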

The world-to-screen transform is then simply Mws = Mvs*Mwv. As a shortcut, the screen-to-world transform can be calculated directly, without computing Mwv or Mvs separately:

// get Mwvt, and compute scale and tran, as above
Mat4 Msw = inv(tran*scale*trans(Mwvt)); // screen-to-world transform

Transform a Point

Next, we need to compute a vector from a point on the polygon to the camera. To do that, we'll first transform an object point from object space to world space, then take the difference of the camera position (in world space) and the object point position (in world space).

We'll need two more matrices: Mow, the object-to-world transform, and Mwo, the world-to-object transform.

The first of these, Mow, is simply the transpose of OpenGL's modelview matrix:
Mat4 Mowt;
glGetDoublev(GL_MODELVIEW_MATRIX, Mowt.Ref());
Mat4 Mow = trans(Mowt);
And the latter, Mwo, is the inverse of Mow.
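In SVL that's a one-liner:

Mat4 Mwo = inv(Mow);		// world-to-object transform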

Transforming the object space point po to world space is easy and familiar:

Vec3 po;				// point in object space
Vec3 pw = proj(Mow*Vec4(po,1));	// point in world space
Since po is a point (not a direction), we append a homogeneous coordinate of 1 and transform by the 4x4 object-to-world matrix. Don't forget to do the homogeneous division after the transformation. If Mow is affine (as it typically is), we'll end up dividing by 1, so simply extracting the first three coordinates suffices in that case.

The direction of the camera from the point, in world space, is then cw-pw.

Transform a Normal

We also need to transform face normals for backface testing. Given a face normal "no" in object space, should we transform it to world space like this: proj(Mow*Vec4(no,1))? NO, way off! How about like this (changing the 1 to a 0 and skipping the divide): first3(Mow*Vec4(no,0))? AGAIN, NO! The latter formula only works in the unlikely event that Mow is a rotation only.

It turns out the correct formula to transform a normal to world space is:

Vec3 no;				// normal in object space
Vec3 nw = first3(Vec4(no,0)*Mwo);	// normal in world space
Above we're multiplying a row vector times a matrix. Where does this formula come from? Recall that transforming a normal (or a plane) requires multiplication of the coefficients of the plane (a,b,c,d) or the normal (a,b,c,0) on the right by the inverse of the point transformation matrix, or equivalently, multiplication on the left by the inverse-transpose. In the 15-462 notes on transformations, see the slides on transforming normals. This is important stuff!

The hypothetical subroutine first3() used above simply extracts the first three elements of a Vec4 to form a Vec3. The fourth coordinate, the plane offset coefficient d, is irrelevant to no and nw in the (typical) case that Mwo is an affine transformation, since parallel planes transform to parallel planes through an affine transformation. Note that it would be inappropriate to do a homogeneous division here, since that could change the sign of the normal coefficients and flip the normal's direction. For the purposes of backface testing, the lengths of no and nw are irrelevant; we need not normalize those vectors.
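SVL doesn't provide first3(); a possible definition of this hypothetical helper:

inline Vec3 first3(const Vec4 &v)
{
    return Vec3(v[0], v[1], v[2]);	// drop the fourth (d) coefficient
}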

Putting it All Together

Given all of the above (whew!) we simply take the dot product of cw-pw and nw to determine if the face is front-facing or not. Finding silhouettes is easy once we have the above details worked out.
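As a summary, here is a minimal sketch bundling the steps above into a single test, assuming Mow, Mwo, and the world-space camera position cw have been computed as described, and using the hypothetical first3() helper:

// true iff the face with object-space point po and normal no faces the camera
bool frontFacing(const Vec3 &po, const Vec3 &no,
		 const Mat4 &Mow, const Mat4 &Mwo, const Vec3 &cw)
{
    Vec3 pw = proj(Mow*Vec4(po,1));	// point in world space
    Vec3 nw = first3(Vec4(no,0)*Mwo);	// normal in world space
    return dot(cw-pw, nw) > 0;		// positive dot product => front-facing
}

A per-face wrapper (like the frontFacing(const Face &) assumed in the first sketch) would pull po and no out of the face and supply the cached matrices and camera position.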

The above can be optimized by calculating matrices only when they change and caching transformed points and normals.

As an aside: once we've got Mow and Mws, we can simulate gluProject(), which transforms points from object space to screen space:
Vec3 ps = proj(Mws*Mow*Vec4(po,1));	// point in screen space

15-463, Computer Graphics 2
Paul Heckbert, 26 Apr 2001