Standard clip planes

In GLSL, we have to write to gl_ClipVertex in the vs.

So in the position-invariant case, I should do

gl_Position = ftransform();
gl_ClipVertex = modelview * gl_Vertex;

so I’m writing to both, right?
Would I have to do something similar if I’m computing gl_Position myself (skinning or something)?
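Something like this is what I have in mind (a rough sketch only; BoneMatrix is a made-up stand-in for whatever the skinning produces, and since user clip planes are specified in eye space, gl_ClipVertex gets the eye-space position):

//VERT (sketch)
#version 110

uniform mat4 BoneMatrix;	// hypothetical single-bone skinning matrix (object space)

void main()
{
	vec4 skinnedVertex = BoneMatrix * gl_Vertex;

	// clip-space position computed manually instead of ftransform()
	gl_Position = gl_ModelViewProjectionMatrix * skinnedVertex;

	// eye-space position for the fixed-function user clip planes
	gl_ClipVertex = gl_ModelViewMatrix * skinnedVertex;
}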

What are the performance implications of this on all the hw that is out there?

I think it was nVidia or jwatte who recommended burning a texture unit and doing it ourselves in the fs.
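What I mean is roughly this (just a sketch, not anything quoted from them; ClipPlaneEye is a made-up uniform holding the plane in eye space):

//VERT (sketch)
#version 110

uniform vec4 ClipPlaneEye;	// hypothetical: plane equation in eye space

void main()
{
	gl_Position = ftransform();

	vec4 eyePos = gl_ModelViewMatrix * gl_Vertex;

	// signed distance to the plane, passed down like a 1D texcoord
	gl_TexCoord[0].x = dot(ClipPlaneEye, eyePos);
}

//FRAG (sketch)
#version 110

void main()
{
	// emulate the clip plane per fragment
	if (gl_TexCoord[0].x < 0.0)
		discard;

	gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
}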

What about the number of enabled clip planes? 1 OK? 2 OK? 3 OK?

There is a built-in constant whose name is gl_MaxClipPlanes. Its minimum value is 6.
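On the application side the same limit can be queried, and the planes are set up the usual way; a minimal sketch (the plane values here are only placeholders):

GLint maxPlanes = 0;
glGetIntegerv(GL_MAX_CLIP_PLANES, &maxPlanes);   /* at least 6 */

/* the equation is given in object coordinates and stored in eye space
   (transformed by the inverse of the current modelview matrix), which is
   why gl_ClipVertex is expected in eye space too */
GLdouble eqn[4] = { 0.0, 1.0, 0.0, 0.0 };        /* placeholder: keep y >= 0 */
glClipPlane(GL_CLIP_PLANE0, eqn);
glEnable(GL_CLIP_PLANE0);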

By writing to gl_ClipVertex on NV hardware, I’m getting 6 additional DP4s in the output VP20 program assembly. I’m not sure if and how the driver can optimize this when you have, say, only 2 planes enabled, or when you’re doing the standard:

gl_Position = ftransform();
gl_ClipVertex = gl_ModelViewMatrix * gl_Vertex;

Originally posted by V-man:
I think it was nVidia or jwatte who recommended burning a texture unit and doing it ourselves in the fs.
That’s a really bad idea. Conventional clip planes cut the geometry, resulting in less fragment shading work and thus increasing performance, whereas discarding fragments in the shader adds to the shader length and disables early-Z optimizations.

Originally posted by V-man:
What about the number of enabled clip planes? 1 OK? 2 OK? 3 OK?
The minimum number that an OpenGL implementation has to support is 6. On ATI all 6 are hardware accelerated.

I forgot to say something in my first post. What is gl_ClipVertex exactly?

In ARB_vp/NV_vp, I think this doesn’t exist.

That’s a really bad idea. Conventional clip planes cut the geometry, resulting in less fragment shading work and thus increasing performance, whereas discarding fragments in the shader adds to the shader length and disables early-Z optimizations.
I looked at the NV_vp spec again. That’s what they recommend, because client clip planes are bypassed.

If we do use GLSL and write to gl_ClipVertex, what happens on ATI and NV?

I’m assuming it costs more instructions if we enable more clip planes.

On ATI it will be clipped in hardware, which creates new geometry that covers less screen space, thus reducing fragment shading. More clip planes won’t make the shader take more instructions (AFAIK, but I’m not 100% sure), but clipping isn’t entirely free either. Still, it’s normally way cheaper than killing fragments. Exceptions would be some extremely high-polygon cases.

That’s good to hear. NV GPUs that have VP2 output to all the clip distance registers. Read spasi’s post.

Does ATI do something like this?

I’m going to spend time on this stuff today.
When I wrote to gl_ClipVertex and didn’t have clip plane 0 enabled, it gave some graphical anomalies on ATI. It looked like a checkerboard texture texgened in eye-linear mode onto some of my objects.

That sounds like a Z-fighting issue. You will need a depth bias if you use clip planes in a multi-pass implementation unless all passes use the exact same set of clip planes.
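If it does turn out to be multi-pass, the bias can be as simple as polygon offset on the pass whose clip-plane set differs; a sketch (the offset values are placeholders and need tuning):

glEnable(GL_POLYGON_OFFSET_FILL);
glPolygonOffset(-1.0f, -1.0f);   /* pull this pass slightly toward the viewer */
/* ... draw the pass that uses the different clip-plane set ... */
glDisable(GL_POLYGON_OFFSET_FILL);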

OK, I came back to this.

I’m not doing a multipass. It was just a dumb test. Forget about it for now as there is another problem.

I have 2 versions of my shaders: ones that write to gl_ClipVertex and ones that don’t.
For all the ones that write to gl_ClipVertex, the compiler says

Link successful. The GLSL vertex shader will run in software - unsupported language element used. The GLSL fragment shader will run in hardware.
Notice the word “SOFTWARE”
What’s the problem?
I haven’t run the shaders yet because there is some more coding to do. I’m assuming performance will not be terrible.

Here’s my shortest shader

//VERT
#version 110

//UNIFORMS
uniform mat4 ModelviewMatrix;

void main()
{
	gl_Position = ftransform();

	gl_ClipVertex = ModelviewMatrix * gl_Vertex;
}

//FRAG
#version 110

const vec4 White = vec4(1.0, 1.0, 1.0, 1.0);

void main()
{
	gl_FragColor = White;
}
// The GLSL vertex shader will run in software - unsupported language element used.
//
uniform mat4 ModelviewMatrix;
void main(void)
{
   gl_Position = ftransform();
   gl_ClipVertex = ModelviewMatrix * gl_Vertex;
}
// The GLSL vertex shader will run in hardware.
// User clipping will be "undefined".
//
// BUT, on ATI's implementation user clipping
// will "just work" if ftransform() is used.
//
void main(void)
{
   gl_Position = ftransform();
// gl_ClipVertex = ModelviewMatrix * gl_Vertex; // Rely on ATI's "undefined behavior" by NOT writing gl_ClipVertex
//                                              // this will result in HW vertex shader
}

Your choice.

-mr. bill

Hi mrbill,

Thanks, I wasn’t aware of that.

I did tests with both versions of my shaders, and both DON’T render correctly and both perform poorly.
If I don’t write to gl_ClipVertex, it runs slightly faster, but it’s still unacceptable. My scene is simple.
My fixed-function path runs trouble-free.
I commented out my calls to glDepthRange(1.0, 1.0) and glDepthFunc(GL_ALWAYS), which I suspected were the source of the problem, but that didn’t fix it.
I disabled stencil testing. I can see my geometry in the background getting rendered correctly, but it’s dog slow. CPU usage is 100%.
I don’t see what else it could be.

I am trying to do stencil mirrors. I have just one mirror in the scene.

The alternative approach is render-to-texture, but rendering directly to the framebuffer is simpler for me.
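For context, the rough shape of what I’m doing is this (a simplified sketch; drawMirrorQuad(), drawScene() and the plane values are just placeholders for my actual code):

/* 1. mark the mirror in the stencil buffer */
glEnable(GL_STENCIL_TEST);
glStencilFunc(GL_ALWAYS, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
drawMirrorQuad();
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);

/* 2. draw the reflected scene only where the stencil is set, with a user
      clip plane at the mirror so reflected geometry can't poke through */
glStencilFunc(GL_EQUAL, 1, 0xFF);
glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
GLdouble mirrorPlane[4] = { 0.0, 0.0, 1.0, 0.0 };   /* placeholder plane */
glClipPlane(GL_CLIP_PLANE0, mirrorPlane);
glEnable(GL_CLIP_PLANE0);
glPushMatrix();
/* multiply in the reflection matrix here */
drawScene();
glPopMatrix();
glDisable(GL_CLIP_PLANE0);

/* 3. draw the normal scene */
glDisable(GL_STENCIL_TEST);
drawScene();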

Have these questions been resolved at all, now that it’s 2006?

I have some volume rendering experiments I want to do, and for some of them I really want the GPU to do the planar clipping rather than having to make the app do the clipping in software.

Jon
