View Full Version : Probable nVidia GLSL compiler bug

11-16-2011, 06:20 AM

Does the discard keyword imply a return? If it does, then I have found a bug in the nVidia GLSL compiler.

It's very hard to package a small repro; this is the best I could do so far. My apologies.

The bug occurs when you replace //WORKING1 or //WORKING2 with //NOTWORKING.
It causes an application freeze, sometimes a BSOD on XP, or a driver recovery on Windows 7.
I expect the NOTWORKING, WORKING1 and WORKING2 code to behave the same.
But somehow, 'discard' doesn't seem to do its job: execution seems to continue.

I would appreciate it if somebody could have a look and check that I am not going nuts...

A few remarks:

- please ignore the crap at the beginning of main(), above the first 'z' loop; it is not the culprit here.
- also ignore the 'problem' flag and its management. I wrote this to try to detect odd code behavior.
- you can tell something goes wrong from shader performance: discard seems to let execution continue, and as a result execution is very slow in my case.
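In other words, the suspected behavior reduces to a loop of this shape (a simplified sketch with placeholder helpers, not the actual repro):

```glsl
// Simplified sketch, not the repro: pixelRejected() and doExpensiveWork()
// are hypothetical placeholders. If discard behaved like a return, the
// loop would stop at the first rejected iteration; if execution continues
// past it, all primitiveNum * 200 iterations still run, which would match
// the observed slowdown.
for (int z = 0; z < primitiveNum * 200; z++)
{
    if (pixelRejected(z))
        discard; // expected to end the fragment here

    doExpensiveWork(z); // still executes if discard does not halt
}
```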

flat in int segmentsPerScanLine;
flat in int primitiveNum; // always 4096

in vec3 normalDir;
in vec3 viewDir;
flat in vec4 finalColor;

void main(void)
{
    bool shouldDiscard = false;
    bool shouldExit = false;
    bool problem = false;

    int bufferSize = primitiveNum; // & 0xFFFF;

    int ttStartOffset = segmentsPerScanLine;

    float fbufferSize = float(bufferSize);
    float fbufferSizeM1 = fbufferSize - 1.0;
    float xx = gl_TexCoord[0].s * fbufferSize;
    float yy = gl_TexCoord[0].t * fbufferSize;
    float vx = clamp(xx, 0.0, fbufferSizeM1);
    float vy = clamp(yy, 0.0, fbufferSizeM1);
    float vxFloating = clamp(xx, 0.0, fbufferSize);
    int iVx = int(vx);
    int iVy = int(vy);
    int lineStartOffset = GET_TEXTURE_IVEC4(ttStartOffset + iVy).r;
    int currentOffset = lineStartOffset;

    int referencePixelIndex = 0;
    int nn = 0;
    for (int z = 0; z < (primitiveNum * 200); z++)
    {
        if (shouldExit)
            problem = true;

        ivec4 pixelGroups = GET_TEXTURE_IVEC4(ttStartOffset + currentOffset);
        for (int n = 0; n < 4; n++)
        {
            int pixelGroup = pixelGroups[n];
            int lastPixel = (pixelGroup & 0xFFF);
            if (iVx <= lastPixel)
            {
                bool canDisplayPixel;
                vec4 theColor = finalColor;

                if ((pixelGroup & 0x40000000) != 0 || (pixelGroup & 0x20000000) != 0) // interior pixel
                    canDisplayPixel = true;
                else // exterior pixel
                    canDisplayPixel = false;

                if (canDisplayPixel)
                {
                    shouldExit = true;
                }
                shouldExit = true;
                shouldDiscard = true;
            }
        }

        if (shouldExit)
            break;
    }

    if (problem)
        gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0);
    else if (shouldDiscard)
        gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
}


NB: drivers tested: 285.58, 260.99. Windows XP 32-bit, GeForce GT 430 1024MB.

11-16-2011, 12:47 PM
According to my copy of the Orange book:
An implementation might or might not continue executing the shader, but it is guaranteed that there is no effect on the framebuffer.
A quick read through the GLSL spec backs this up: all it says is that the shader outputs won't be written; it doesn't specify whether or not the shader continues executing.
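Given that wording, a portable shader has to assume execution may continue past discard and make the early exit explicit itself. A minimal defensive sketch (shouldReject() and shade() are hypothetical helpers, not from the original code):

```glsl
void main(void)
{
    if (shouldReject(gl_TexCoord[0].st))
    {
        discard;
        return; // defensive: the spec does not require discard to halt the shader
    }
    gl_FragColor = shade();
}
```

Pairing discard with an explicit return (or break, inside a loop) guarantees no further per-fragment work runs, regardless of how a particular compiler implements discard.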