ATI 11.11 bug with stencil routing k-buffer

I think there’s a bug in the AMD driver that shows up when implementing a stencil-routed k-buffer. I render 4 quads to a multisampled framebuffer with all the appropriate state set for stencil routing, but on the AMD card every depth sample ends up with the same depth: the depth value of the first fragment to touch that pixel. The attached screenshots show a simple demo in which each of the 4 quads is routed to a separate sample in the framebuffer. The projection is orthographic and the quads are all the same size. On NVIDIA hardware, the resolve pass produces a different depth value for each quad. On AMD hardware, the resolve pass produces the exact same depth value for all of them, and it is always the depth of the first quad rendered to the screen. The rendering order doesn’t matter: front to back or back to front gives the same result.

AMD Radeon 5850 with Catalyst 11.11

NVIDIA GeForce 480 with driver 285.62

Anybody else have this problem?

Did you turn on sample shading? Because without that, OpenGL is free to execute your fragment shader only once for a pixel’s worth of samples.

Or, to put it another way, if you weren’t using sample shading, then you were relying on undefined behavior and happened to get lucky with NVIDIA’s drivers.

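That’s what the stencil test is for in the stencil-routed k-buffer approach: each sample carries its own stencil value, so each incoming fragment passes the test for only one sample and gets written there. It doesn’t depend on the fragment shader running per sample.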
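For anyone unfamiliar with the routing trick, here is a minimal CPU-side sketch of one pixel in plain C. It assumes the common formulation of the technique (sample s is cleared to stencil value 2 + s, the stencil test is EQUAL against 2, and the stencil op is DECR for every outcome); the original poster's exact state is not shown in this thread, so treat those values, and the names `RoutedPixel`, `routed_pixel_clear` and `routed_pixel_write`, as illustrative assumptions. It only models where each fragment's depth should land, not any real driver.

```c
#include <assert.h>
#include <stddef.h>

#define K_SAMPLES 4   /* samples per pixel (4x MSAA, as in the demo) */
#define ROUTE_REF 2   /* assumed stencil reference for the EQUAL test */

/* One multisampled pixel: per-sample stencil and depth. */
typedef struct {
    unsigned char stencil[K_SAMPLES];
    float depth[K_SAMPLES];
} RoutedPixel;

/* Clear pass: sample s gets stencil ROUTE_REF + s (assumed setup). */
static void routed_pixel_clear(RoutedPixel *p, float clear_depth)
{
    assert(p != NULL);
    for (int s = 0; s < K_SAMPLES; ++s) {
        p->stencil[s] = (unsigned char)(ROUTE_REF + s);
        p->depth[s] = clear_depth;
    }
}

/* One fragment covering the whole pixel. Only the sample whose stencil
 * equals ROUTE_REF is written; every sample's stencil is then decremented,
 * so the next fragment is routed to the next sample. */
static void routed_pixel_write(RoutedPixel *p, float frag_depth)
{
    assert(p != NULL);
    for (int s = 0; s < K_SAMPLES; ++s) {
        if (p->stencil[s] == ROUTE_REF)
            p->depth[s] = frag_depth;   /* stencil test passed */
        if (p->stencil[s] > 0)          /* DECR clamps at 0 */
            p->stencil[s]--;
    }
}
```

Clearing a pixel and then writing four fragments should leave four different depths in samples 0 to 3, in submission order, and a fifth fragment should be dropped. That is the result described for the NVIDIA card; the AMD result corresponds to all four samples holding the first fragment's depth.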

It turns out it’s sample masking that’s broken on AMD cards. I suppose the other possibility is that the AMD driver always fetches sample 0 when texelFetch is called in the shader.

The following code should write the 2 quads to samples 0 and 1 respectively. On NVIDIA hardware both the color and depth are correct. But on AMD hardware, the depth value of the first quad gets written to all 4 samples.

glEnable(GL_MULTISAMPLE);
glEnable(GL_SAMPLE_MASK);

// Quad 1 (red, z = -1): restrict writes to sample 0 only.
glSampleMaski(0, 0x01);
glBegin(GL_QUADS);
glColor4f(1.0f, 0.0f, 0.0f, 1.0f);
glVertex3f(-0.5f, -0.5f, -1.0f);
glVertex3f(0.5f, -0.5f, -1.0f);
glVertex3f(0.5f, 0.5f, -1.0f);
glVertex3f(-0.5f, 0.5f, -1.0f);
glEnd();

// Quad 2 (green, z = -2): restrict writes to sample 1 only.
glSampleMaski(0, 0x02);
glBegin(GL_QUADS);
glColor4f(0.0f, 1.0f, 0.0f, 1.0f);
glVertex3f(-0.5f, -0.5f, -2.0f);
glVertex3f(0.5f, -0.5f, -2.0f);
glVertex3f(0.5f, 0.5f, -2.0f);
glVertex3f(-0.5f, 0.5f, -2.0f);
glEnd();

glDisable(GL_MULTISAMPLE);
glDisable(GL_SAMPLE_MASK);
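To spell out what the snippet above is expected to produce, here is a small CPU-side model of one 4-sample pixel in plain C. It is only a sketch: the names `MaskedPixel`, `masked_pixel_clear` and `masked_pixel_write` are made up for illustration, the depth test is ignored, and the depth and color values are arbitrary placeholders rather than the actual window-space values from the demo.

```c
#include <assert.h>
#include <stddef.h>

#define MS_SAMPLES 4  /* samples per pixel */

/* One multisampled pixel: per-sample depth and packed RGBA color. */
typedef struct {
    float depth[MS_SAMPLES];
    unsigned int color[MS_SAMPLES];
} MaskedPixel;

static void masked_pixel_clear(MaskedPixel *p, float clear_depth,
                               unsigned int clear_color)
{
    assert(p != NULL);
    for (int s = 0; s < MS_SAMPLES; ++s) {
        p->depth[s] = clear_depth;
        p->color[s] = clear_color;
    }
}

/* One fully covering fragment. Bit s of sample_mask plays the role of the
 * mask passed to glSampleMaski(0, ...): a cleared bit means sample s is
 * left untouched, for both color and depth. */
static void masked_pixel_write(MaskedPixel *p, unsigned int sample_mask,
                               float depth, unsigned int color)
{
    assert(p != NULL);
    for (int s = 0; s < MS_SAMPLES; ++s) {
        if (sample_mask & (1u << s)) {
            p->depth[s] = depth;
            p->color[s] = color;
        }
    }
}
```

With a pixel cleared to depth 1.0, writing the first quad with mask 0x01 and the second with mask 0x02 should leave the first quad's depth in sample 0 only, the second quad's depth in sample 1 only, and samples 2 and 3 at the clear value. The behavior reported on AMD instead matches the first quad's depth appearing in all 4 samples.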

AMD (screenshot)

NVIDIA (screenshot)

This could be a driver bug, but we’d need to see more of the application. The problem could be in the generation of the data or in the resolve. Can you send us the app? You can PM me here or email my.name <at> amd.com.

My understanding is that he’s rendering 4 quads on top of each other, and expecting each quad to go into a different slot (MSAA buffer sample) by using sample masking. Sample shading shouldn’t matter here - it’s OK for the fragment shader to run only once, as he’s generating 4 separate samples per pixel, one per quad.

It looks like the color samples are being stored and can then be fetched as expected, but depth samples are either not getting stored correctly, or there is a bug when fetching depth samples in a shader.

I ran into a very similar problem where texelFetch was broken for MSAA depth textures on AMD, and submitted a repro for it… I think it ended up getting fixed on some AMD hardware, but not all (5xxx series was still broken last I checked, some older cards that I tested worked).

Thanks Graham, I’ve emailed you at your amd address.
