Some clarification about clipping

Hello, I need some clarification about clipping. It is said that clipping is done before the perspective division and that the view volume lies within the following bounds:

w >= x >= -w,
w >= y >= -w,
w >= z >= -w.

1) Does this mean clipping is performed just before the transformation to normalized device coordinates?

2) Then, for a perspective transformation, clipping is performed after the frustum view volume has been converted to a rectangular parallelepiped. Is that right?

3) In homogeneous coordinates, with w = -z, a vertex then looks like (dx, dy, az + b, -z), where d is the distance along -z from the eye. Is this the stage at which clipping takes place?

4) The next step is dividing by w to get Cartesian coordinates: (-dx/z, -dy/z, -(az + b)/z, 1). So clipping takes place before this step. Is that right?

  1. It is said that OpenGL does clipping in image space. How does that relate to the steps above, if clipping is not performed in object space?
  2. Are the clip-space coordinates maintained entirely by the system, or can the user manipulate them from the application?

Thanks in advance

The result is “as if” clipping was performed in clip coordinates, prior to projective division (division by W).

An implementation need not actually “perform” clipping as a distinct operation, so long as it produces the correct result.

Specifying that clipping is performed prior to projective division essentially means that it doesn’t matter if W is zero at a vertex or within the primitive. If projective division were performed on the vertex coordinates prior to clipping, that could result in division by zero.
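To make the point about W = 0 concrete, here is a minimal sketch in Python (not OpenGL code; the function names `inside_clip_volume` and `to_ndc` are made up for illustration). The clip test only compares values, so it is well-defined for any W, while the division is not.

```python
def inside_clip_volume(x, y, z, w):
    """Clip test in clip coordinates: -w <= x, y, z <= w. No division needed."""
    return all(-w <= c <= w for c in (x, y, z))


def to_ndc(x, y, z, w):
    """Projective division; Python raises ZeroDivisionError when w == 0."""
    return (x / w, y / w, z / w)


# A vertex with w == 0 can be classified without any division.
print(inside_clip_volume(1.0, 0.0, 0.0, 0.0))   # False
# A vertex inside the clip volume has w > 0 here, so dividing is safe.
print(inside_clip_volume(1.0, -1.0, 0.0, 2.0))  # True
print(to_ndc(1.0, -1.0, 0.0, 2.0))              # (0.5, -0.5, 0.0)
```

Calling `to_ndc(1.0, 0.0, 0.0, 0.0)` directly would raise, which is the situation that clipping first avoids.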

It also simplifies the calculation of interpolated vertex attributes for vertices generated by clipping. The interpolation needs to be perspective-correct; this would be impossible if clipping used NDC and ignored the W coordinate.
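As a rough illustration of how a clipped vertex and its attribute can be generated together, the sketch below clips one edge against the near plane (z >= -w, i.e. z + w >= 0) in clip coordinates. It is only a sketch of the usual approach, not what any particular implementation does; `clip_edge_near` and the single scalar attribute are assumptions for the example.

```python
def clip_edge_near(p0, p1, a0, a1):
    """Intersect the edge p0 -> p1 with the plane z + w = 0.

    p0, p1 are clip-space (x, y, z, w) tuples on opposite sides of the plane;
    a0, a1 are the values of some vertex attribute at p0 and p1.
    The same parameter t is used for all four coordinates and the attribute.
    """
    d0 = p0[2] + p0[3]          # signed "distance" of p0 from the plane
    d1 = p1[2] + p1[3]          # signed "distance" of p1 from the plane
    t = d0 / (d0 - d1)          # d0 and d1 have opposite signs, so d0 != d1
    p = tuple(c0 + t * (c1 - c0) for c0, c1 in zip(p0, p1))
    a = a0 + t * (a1 - a0)
    return p, a


# p0 is inside the near plane; p1 is outside and even has negative w.
p, a = clip_edge_near((0.0, 0.0, 0.0, 2.0), (4.0, 0.0, -1.0, -1.0), 0.0, 1.0)
print(p, a)  # (2.0, 0.0, -0.5, 0.5) 0.5
```

The new vertex lies exactly on the plane (z + w = 0) and still carries its own W, so later perspective-correct interpolation has what it needs.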

Clipping is performed using the clip-space vertex coordinates. At that point, concepts such as “perspective transformation” or “view frustum” aren’t particularly meaningful.

If you have a projection matrix which transforms coordinates from “eye space” to clip space (this concept is enforced by the fixed-function pipeline, but not required when using a vertex shader), then the eye-space “view frustum” is simply the result of transforming the signed unit cube (whose vertex coordinates are all +/- 1) by the inverse of the projection matrix.
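A quick way to check this relationship numerically is to go in the forward direction: build the matrix documented for glFrustum(), transform the eight eye-space frustum corners, and confirm they land on the corners of the signed unit cube. This is a plain-Python sketch; the helper names and the sample parameter values are chosen for the example.

```python
from itertools import product


def frustum(l, r, b, t, n, f):
    """Projection matrix as documented for glFrustum(), stored as rows."""
    return [
        [2 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


l, r, b, t, n, f = -2.0, 2.0, -1.0, 1.0, 1.0, 3.0
m = frustum(l, r, b, t, n, f)

ndc_corners = set()
for x, y in product((l, r), (b, t)):
    for depth in (n, f):
        s = depth / n  # far corners are the near corners scaled by far/near
        cx, cy, cz, cw = transform(m, [x * s, y * s, -depth, 1.0])
        ndc_corners.add((cx / cw, cy / cw, cz / cw))

print(sorted(ndc_corners))  # the eight (+/-1, +/-1, +/-1) corners
```

Applying the inverse matrix to the cube corners would recover the same eight eye-space points (up to the homogeneous scale factor).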

The actual rendering process deals with clip coordinates, normalised device coordinates and window coordinates. Other coordinate systems (e.g. object space and eye space) are either created by the programmer (when using a vertex shader) or provided as a convenience (when using the fixed-function pipeline). By and large, they “mean” whatever you want them to mean.

Vertex coordinates written to gl_Position by a vertex shader are in clip coordinates. It’s entirely up to the shader how those coordinates are generated. If the fixed-function pipeline is being used, the clip-space coordinates are obtained from the object-space coordinates (passed via e.g. glVertex4f() or glVertexPointer()) by transforming them via the model-view and projection matrices.

Conversion from clip coordinates to NDC involves dividing by W; this step is completely fixed (i.e. it isn’t affected by any configurable parameters). Conversion from NDC to window coordinates is affected by the viewport transformation (glViewport()) and the depth range (glDepthRange()).
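The two conversions can be sketched as follows. This is an illustration in Python of the arithmetic only, assuming the default depth range of 0 to 1; `clip_to_window` is a made-up name, with `viewport` standing in for the glViewport() arguments and `depth_range` for the glDepthRange() arguments.

```python
def clip_to_window(clip, viewport, depth_range=(0.0, 1.0)):
    """Clip coordinates -> NDC -> window coordinates.

    clip:        (x, y, z, w) in clip space, with w != 0
    viewport:    (x, y, width, height), as passed to glViewport()
    depth_range: (near, far), as passed to glDepthRange()
    """
    x, y, z, w = clip
    xn, yn, zn = x / w, y / w, z / w           # fixed step: divide by W
    vx, vy, vw, vh = viewport
    near, far = depth_range
    xw = vx + (xn + 1.0) * vw / 2.0            # viewport transformation
    yw = vy + (yn + 1.0) * vh / 2.0
    zw = (far - near) / 2.0 * zn + (near + far) / 2.0  # depth range mapping
    return (xw, yw, zw)


print(clip_to_window((1.0, -1.0, 0.0, 2.0), (0, 0, 800, 600)))
# (600.0, 150.0, 0.5)
```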

Hello GClements, thank you so much for the clarification. It is much clearer now: at the clipping stage the perspective transformation doesn’t have meaning; it only has meaning after the perspective division. Now, if it is said that at the clipping stage the bounding volume is as follows:
w >= x >= -w,
w >= y >= -w,
w >= z >= -w.

Here x and y are the dimensions of the near plane, aren’t they?

No, X, Y, Z, and W are the four components of the vertex’s position. In clip space, each vertex effectively has its own extent.

x, y, z and w are the clip-space vertex coordinates.

After projective division, the above constraints become

1 >= x/w >= -1,
1 >= y/w >= -1,
1 >= z/w >= -1.

i.e. the coordinates are constrained to the signed unit cube. But the clip-space equations are well-defined even if w=0, while the NDC equations aren’t.

If the clip-space coordinates were generated via transformation by a projection matrix generated by glFrustum(), then the equivalent eye-space constraints are (assuming that eye-space W is always 1):

right/nearVal >= -x/z >= left/nearVal
top/nearVal >= -y/z >= bottom/nearVal
farVal >= -z >= nearVal
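As a sanity check, the sketch below compares the clip-space test with the eye-space constraints above for a few sample points. It is plain Python, writes out the glFrustum() matrix product by hand for eye-space W = 1, and uses made-up helper names and parameter values.

```python
def clip_coords(eye, l, r, b, t, n, f):
    """Eye-space (x, y, z), with W = 1, times the glFrustum() matrix."""
    x, y, z = eye
    return (
        2 * n / (r - l) * x + (r + l) / (r - l) * z,
        2 * n / (t - b) * y + (t + b) / (t - b) * z,
        -(f + n) / (f - n) * z - 2 * f * n / (f - n),
        -z,
    )


def inside_clip(clip):
    """Clip-space test: -w <= x, y, z <= w."""
    x, y, z, w = clip
    return all(-w <= c <= w for c in (x, y, z))


def inside_eye(eye, l, r, b, t, n, f):
    """Eye-space constraints; the depth check comes first so z != 0 below."""
    x, y, z = eye
    if not (n <= -z <= f):
        return False
    return l / n <= -x / z <= r / n and b / n <= -y / z <= t / n


params = (-2.0, 2.0, -1.0, 1.0, 1.0, 3.0)  # left, right, bottom, top, near, far
samples = [
    (0.0, 0.0, -2.0),   # inside
    (5.0, 0.0, -2.0),   # too far right
    (0.0, 0.0, -0.5),   # in front of the near plane
    (0.0, 0.0, -4.0),   # beyond the far plane
    (0.0, 0.0, 2.0),    # behind the eye
]
for eye in samples:
    print(eye, inside_clip(clip_coords(eye, *params)), inside_eye(eye, *params))
```

Both tests agree on every sample; only the first point is inside.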

OK, clipping is performed in homogeneous coordinates to avoid division by zero. But can’t division by zero also happen during the perspective division? I’m a bit confused.

As w approaches zero, the set of points for which w >= x, y, z >= -w (i.e. those which pass the clip test) diminishes. The only point with w=0 which can pass the clip test is the one where x, y and z are also zero. But for a perspective projection where the near distance is greater than zero, z and w cannot be zero simultaneously (and for an orthographic projection, w is a non-zero constant).

With a typical perspective transformation, clipping against the near plane ensures that the clipped geometry has a minimum distance in front of the eye (a minimum value of -Z in eye space). As clip-space W is proportional to that distance, the clipped geometry also has a minimum value for clip-space W.

If you manually construct a degenerate perspective transformation with a near distance of zero, geometry which intersects the eye-space origin will have x=y=z=w=0, in which case the result of conversion to NDC will be undefined (zero divided by zero). But such a transformation will have NDC Z equal to 1 everywhere (except for the case where it’s undefined), so it wouldn’t have much practical use.
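The degenerate case is easy to reproduce numerically. The sketch below uses only the Z and W rows of the glFrustum() matrix with a near distance of zero; `ndc_z` is a made-up helper and the far value is arbitrary.

```python
def ndc_z(eye_z, near, far):
    """NDC Z for an eye-space point with W = 1, using the glFrustum() Z and W rows."""
    clip_z = -(far + near) / (far - near) * eye_z - 2 * far * near / (far - near)
    clip_w = -eye_z
    return clip_z / clip_w


# With near = 0, every point in front of the eye gets NDC Z = 1.
for z in (-0.25, -3.0, -7.5, -10.0):
    print(z, ndc_z(z, near=0.0, far=10.0))

# With near > 0, NDC Z actually varies with depth.
print(ndc_z(-1.0, near=1.0, far=3.0), ndc_z(-3.0, near=1.0, far=3.0))  # -1.0 1.0
```

Evaluating `ndc_z(0.0, near=0.0, far=10.0)` is the zero-divided-by-zero case: in Python it raises ZeroDivisionError.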

This is a large part of the reason why the near plane exists. Another reason is that it allows NDC Z to be used for the depth value. As the relationship between NDC Z and NDC X and Y (and thus window-space X and Y) is affine, the per-fragment depth test can be performed without needing a division per pixel.
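The affine relationship can be seen with a small numeric example. The sketch below (plain Python, made-up helper name and sample values, glFrustum()-style projection) projects two eye-space points and their midpoint, then compares plain linear interpolation of NDC Z and of eye-space Z along the projected segment.

```python
def project(eye, l, r, b, t, n, f):
    """Eye-space (x, y, z) with W = 1 -> NDC, via the glFrustum() matrix."""
    x, y, z = eye
    cx = 2 * n / (r - l) * x + (r + l) / (r - l) * z
    cy = 2 * n / (t - b) * y + (t + b) / (t - b) * z
    cz = -(f + n) / (f - n) * z - 2 * f * n / (f - n)
    cw = -z
    return (cx / cw, cy / cw, cz / cw)


params = (-2.0, 2.0, -1.0, 1.0, 1.0, 3.0)  # left, right, bottom, top, near, far
A = (0.0, 0.0, -1.0)
B = (3.0, 0.0, -3.0)
M = (1.5, 0.0, -2.0)                       # eye-space midpoint of A and B

a, b_, m = (project(p, *params) for p in (A, B, M))

# Where M lands along the projected segment, measured in NDC X.
s = (m[0] - a[0]) / (b_[0] - a[0])

ndc_z_lerp = a[2] + s * (b_[2] - a[2])     # linear interpolation of NDC Z
eye_z_lerp = A[2] + s * (B[2] - A[2])      # linear interpolation of eye-space Z

print(s)                  # 0.75, not 0.5: the midpoint moves under projection
print(m[2], ndc_z_lerp)   # 0.5 0.5   -> NDC Z interpolates linearly on screen
print(M[2], eye_z_lerp)   # -2.0 -2.5 -> eye-space Z does not
```

So depth values can be stepped across the screen with additions only, which is what makes NDC Z convenient for the depth buffer.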