Early Z rejection on NV6800!!

Hi all!
I was wondering how you all test whether early Z rejection is working correctly on the NV6800. My current test doesn't seem to be working: the final result was “diminishing” and being “clipped” to nothingness… sigh

//Pass 1: Render Current Depth to a PBuffer
1)clear color to black & clear depth to 1
2)set to draw at backbuffer
3)disable depth test, set depth mask to false
4)select the modelview matrix, load identity, call gluLookAt
5)bind shader, draw a cube

//Vertex program for cube: 
VertexOut RenderDepth(VertexIn IN, uniform half4x4 modelViewProj) 
{ 
      VertexOut OUT; 
      OUT.position = mul(modelViewProj, half4(IN.position, 1.0)); 
      return OUT; 
} 
//Fragment program for cube: 
Pixel RenderDepth(Fragment IN) 
{ 
     Pixel OUT; 
     OUT.color = IN.position.z; //luminance, (z from WPOS) 
     return OUT; 
} 

Result of pass 1:
I see a white square (due to the camera positioning); the surrounding pixels look black. The results were output to The Image Debugger.

//Pass 2: Discarding pass at framebuffer
1)clear color to black & clear depth to 1
2)set to draw at backbuffer
3)enable depth test, set depth mask to TRUE, depth func to LESS
4)set color mask to false,false,false,false
5)disable stencil test, disable alpha test
6)bind shader, draw a quad (z = -1.0f), bind render texture from pass 1

//Vertex program for quad: 
VertexOut DiscardDepth(VertexIn IN) 
{ 
      VertexOut OUT; 
      OUT.position = half4(IN.position, 1.0f); 
      return OUT; 
} 
//Fragment program for quad: 
//output Green color for visual detection 
Pixel DiscardDepth(Fragment_V3 IN, 
                 uniform samplerRECT prevDepth) 
{ 
      Pixel OUT; 
      float prev_depth = texRECT(prevDepth, IN.position.xy).x; 
      if(prev_depth != 0) 
          discard; 
      OUT.color = half4(0,1,0,1); 
      return OUT; 
} 

Result of pass 2:
If color writes were enabled instead, I could see a black square in the framebuffer (due to the discard), with the surrounding pixels green.
If color writes were disabled, I would see a black screen.

After pass 2, i did this to try to see whats in the depth buffer (using The Image Debugger):
imdebugDepthf(0,0,width,height);

What I saw was a white square (where the cube is) surrounded by black: the white area has depth 1 and the surrounding area has depth 0.

At this point I am wondering: since there are 2 depth buffers (I think),
should I be seeing what's in the depth buffer at all?
Or, if early Z is still functioning, should I see all white instead?

Another interesting modification is to set the depth mask to false in pass 2, or to not call the depth mask function at all.

If the depth mask is set to false, the depth buffer image shown in The Image Debugger seems to diminish to nothing.

And if the depth mask function is not called at all, the depth buffer image stays exactly the same as the one from the very first frame. So after pass 3, I see the cube being “clipped” by that very first depth image when I move the camera around.

//Pass 3: Render Cube again, try to test early z culling here
1)enable depth test, set depth mask to FALSE, depth func to LESS
2)set color mask to true
3)select the modelview matrix, load identity, call gluLookAt
4)bind shader, draw cube

//Vertex program for cube: 
VertexOut RayAABBIntersect(VertexIn IN, 
                 uniform half4x4 modelViewProj) 
{ 
     VertexOut OUT; 
     OUT.position = mul(modelViewProj, half4(IN.position, 1.0f)); 
     return OUT; 
} 
//Fragment program for cube: 
//output Purple color for visual detection 
Pixel RayAABBIntersect(Fragment IN) 
{ 
      //output hit coordinates of cube as a color 
} 

Results of pass 3:
the cube looks “diminished”, kinda got “clipped” and became “irregular”.
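One way to reason about the “clipped” look is to replay the depth logic on the CPU. The sketch below is only a model under stated assumptions, not a description of what the driver does: it assumes the default glDepthRange(0, 1), so the pass 2 quad at clip-space z = -1 (with w = 1) lands at window depth 0, and it uses a single 16-pixel scanline. The names `prime_depth` and `visible_fragments` are made up for illustration.

```c
#include <assert.h>

#define SCANLINE 16  /* hypothetical scanline width for the model */

/* Pass 2 model: clear depth to 1.0, then draw a full-width quad at window
   depth 0.0 whose fragments are discarded wherever the pass 1 image is
   non-zero. Depth therefore stays 1.0 under the old cube footprint and
   becomes 0.0 everywhere else. */
static void prime_depth(const float *pass1_image, float *depth, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; ++i) {
        depth[i] = 1.0f;                /* clear depth to 1 */
        if (pass1_image[i] == 0.0f)     /* fragment not discarded */
            depth[i] = 0.0f;            /* 0.0 < 1.0 passes LESS and is written */
    }
}

/* Pass 3 model: LESS test with depth writes off. Counts the fragments of a
   cube covering pixels [first, last] at window depth cube_depth that pass. */
static int visible_fragments(const float *depth, int n,
                             int first, int last, float cube_depth)
{
    int visible = 0;
    for (int i = first; i <= last && i < n; ++i)
        if (i >= 0 && cube_depth < depth[i])
            ++visible;
    return visible;
}
```

In this model, if the cube covered pixels 4..9 when the depth image was made, a cube drawn at 4..9 is fully visible, while a cube that has moved to 7..12 only survives where it overlaps the old footprint (pixels 7..9). That is consistent with the cube being cut off by a stale depth image when the camera moves.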

But suppose I disable the depth test and set the depth mask to false for pass 2:
then I see all white (due to the depth being 1) when I do the imdebugDepthf(0,0,width,height);

The cube then looks alright, but I can't tell whether early Z rejection has taken place, because it did not register a speedup; it's worse in fact, due to the additional calculations.

anyone got test app?
thx!
Edwinz

Did I get this right? You are rendering ONE SINGLE CUBE?
You should really use a more complex scene with a lot of overdraw, complex shaders, etc.

Jan.

hi Jan!
thx fer replying!

The RayAABB intersect shader is expensive enough on its own, run over a quarter of a million fragments.

thx!
Edwinz

OK, I revamped the entire early Z test according to Woid’s recommendations… but still couldn't see any real difference.

here it is again:

Pass 1: generate ray directions to an RTT
pass 1 conditions:
glDisable(GL_DEPTH_TEST);
glDepthMask(false);

Pass 2: render depth only using the checkerboard function, drawing a quad
pass 2 conditions:
glDrawBuffer(GL_BACK);
glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_ALWAYS);
glDepthMask(true);
glColorMask(false,false,false,false);

//Code for checkerboard function (from Woid):

float main(float3 WPos : WPOS) : COLOR 
{ 
    float2 uv = floor(WPos.xy / float2(16,16)); // you can modify size of squares here 
    float res = frac((uv.x + uv.y) / 2); 
    if(res<0.5) 
        discard; 
    return 1; 
}
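As a sanity check on how many fragments this pattern keeps, the same test can be mirrored on the CPU. This is a minimal sketch, not part of the original test app: `checker_discards` and `checker_kept_fraction` are hypothetical names, and integer pixel indices stand in for WPOS (pixel centers at x + 0.5 fall in the same square as x).

```c
#include <assert.h>

/* CPU mirror of the checkerboard fragment program: returns 1 when the
   fragment at pixel (px, py) would be discarded, 0 when it is kept. */
static int checker_discards(int px, int py, int square)
{
    assert(square > 0 && px >= 0 && py >= 0);
    int u = px / square;   /* floor(WPos.x / square) for non-negative x */
    int v = py / square;   /* floor(WPos.y / square) for non-negative y */
    /* frac((u + v) / 2) is 0.0 for an even sum and 0.5 for an odd sum,
       so the shader's `res < 0.5` test discards exactly the even squares. */
    return (u + v) % 2 == 0;
}

/* Fraction of pixels that survive the discard over a width x height window. */
static double checker_kept_fraction(int width, int height, int square)
{
    assert(width > 0 && height > 0);
    long kept = 0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (!checker_discards(x, y, square))
                ++kept;
    return (double)kept / ((double)width * (double)height);
}
```

For a 512x512 window with 16-pixel squares, `checker_kept_fraction(512, 512, 16)` gives 0.5, so the depth-only pass writes depth for half of the pixels and leaves the cleared value in the other half.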

Pass 3: 20 iterations of the Ray AABB intersect, using pass 1’s RTT of ray directions, drawing a quad
pass 3 conditions:
glDepthFunc(GL_LESS);
glDepthMask(false);
glColorMask(true,true,true,true);

The checkerboard function is there so that approximately half of the ray AABB intersections should not need to be performed. With such an expensive fragment computation, the run that includes the depth-only pass should show a clear improvement.

Unfortunately, I am not seeing any such results yet. I ran the Ray AABB intersect 20 times, and with or without the depth-only pass, it gave 72 fps on both runs.

comments?
or anyone have test app?
thx!
Edwinz

>> … it gave 72 fps on both runs…

Disable vertical sync, maybe? It looks like your monitor refresh rate …
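A quick way to check for this kind of cap is to compare the measured frame rate against the display refresh rate and to look at frame time rather than fps. The helpers below are a hypothetical sketch (the names and the tolerance parameter are assumptions, not from any library); the refresh rate has to be supplied by the caller.

```c
#include <assert.h>

/* Returns 1 when measured_fps is within tolerance_fps of refresh_hz,
   which hints (but does not prove) that vsync is capping the frame rate. */
static int looks_vsync_capped(double measured_fps, double refresh_hz,
                              double tolerance_fps)
{
    assert(refresh_hz > 0.0 && tolerance_fps >= 0.0);
    double diff = measured_fps - refresh_hz;
    if (diff < 0.0)
        diff = -diff;
    return diff <= tolerance_fps;
}

/* Converts frames per second to milliseconds per frame, which is easier
   to compare between two runs than raw fps. */
static double frame_time_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}
```

If two runs with very different fragment workloads both report a frame rate that matches the refresh rate, the measurement itself is the first thing to suspect.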

Using discard will disable early Z.

hi all thx fer replying!
i got vsync turned off.

THX fer the hint XMas!

okay i got early z to work on my Rad9550 in D3D, will try to port it to the 6800 soon…

THX!
Edwinz