Hi,
Does the discard keyword imply a return? If the answer is yes, I think I have found a bug in the NVIDIA GLSL compiler.
It’s very hard to package a small repro. This is the best I could do so far. My apologies.
The bug occurs when you replace the //WORKING1 or //WORKING2 code with the //NOTWORKING code.
It causes an application freeze, or sometimes a BSOD on XP, or a driver recovery in Windows 7.
I expect the NOTWORKING, WORKING1 and WORKING2 code to behave the same.
But somehow, ‘discard’ doesn’t seem to do its job: execution seems to continue.
I would appreciate it if somebody could take a look and check that I am not going nuts…
A few remarks:
- please ignore the crap at the beginning of main(), above the ‘z’ loop. It is not the culprit here.
- also ignore the ‘problem’ flag and its management. I wrote this to try to detect odd code behavior.
- you can tell something goes wrong from the shader performance: execution seems to continue past the discard, and as a result the shader runs very slowly in my case.
flat in int segmentsPerScanLine;
flat in int primitiveNum; // always 4096
in vec3 normalDir;
in vec3 viewDir;
flat in vec4 finalColor;

void main(void)
{
    bool shouldDiscard = false;
    bool shouldExit = false;
    bool problem = false;
    int bufferSize = primitiveNum; // & 0xFFFF;
    int ttStartOffset = segmentsPerScanLine;
    float fbufferSize = float(bufferSize);
    float fbufferSizeM1 = fbufferSize - 1.0;
    float xx = gl_TexCoord[0].s * fbufferSize;
    float yy = gl_TexCoord[0].t * fbufferSize;
    float vx = clamp(xx, 0.0, fbufferSizeM1);
    float vy = clamp(yy, 0.0, fbufferSizeM1);
    float vxFloating = clamp(xx, 0.0, fbufferSize);
    int iVx = int(vx);
    int iVy = int(vy);
    int lineStartOffset = GET_TEXTURE_IVEC4(ttStartOffset + iVy).r;
    int currentOffset = lineStartOffset;
    int referencePixelIndex = 0;
    int nn = 0;

    for (int z = 0; z < (primitiveNum * 200); z++)
    {
        // Should never be true here: the outer loop breaks as soon as shouldExit is set.
        if (shouldExit)
            problem = true;

        ivec4 pixelGroups = GET_TEXTURE_IVEC4(ttStartOffset + currentOffset);
        for (int n = 0; n < 4; n++)
        {
            int pixelGroup = pixelGroups[n];
            int lastPixel = (pixelGroup & 0xFFF);
            if (iVx <= lastPixel)
            {
                bool canDisplayPixel;
                vec4 theColor = finalColor;
                if ((pixelGroup & 0x40000000) != 0 || (pixelGroup & 0x20000000) != 0) // interior pixel
                {
                    canDisplayPixel = true;
                }
                else // exterior pixel
                {
                    canDisplayPixel = false;
                }

                if (canDisplayPixel)
                {
                    shouldExit = true;
                    break;
                }
                else
                {
                    // Keep exactly one of the three variants below.
                    //WORKING1
                    shouldExit = true;
                    shouldDiscard = true;
                    break;
                    //WORKING2
                    discard;
                    return;
                    //NOTWORKING
                    //discard;
                }
            }
        }
        currentOffset++;
        if (shouldExit)
            break;
    }

    if (problem)
        gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0);
    else if (shouldDiscard)
        discard;
    else
        gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
}
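Stripped of the texture lookups, the pattern in question boils down to roughly the sketch below: a bare discard inside two nested loops. This is only an illustration, assuming GLSL 1.30. The 'limit' and 'mask' uniforms are hypothetical stand-ins for primitiveNum and the fetched pixel-group data, and this reduced version has not been verified to reproduce the freeze.

```glsl
#version 130

// Hypothetical inputs (not part of the original shader):
// 'limit' stands in for primitiveNum * 200, 'mask' for the pixel-group bits.
uniform int limit;
uniform int mask;

void main(void)
{
    for (int z = 0; z < limit; z++)
    {
        for (int n = 0; n < 4; n++)
        {
            if (((mask >> n) & 1) != 0)
            {
                // NOTWORKING-style variant: discard with no break or return.
                // If discard behaves like a return, neither loop should keep
                // iterating for this fragment after this point.
                discard;
            }
        }
    }

    // Only reached when no bit in the low nibble of 'mask' is set.
    gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
}
```

If discard does imply a return, running this with a large 'limit' and a non-zero low nibble in 'mask' should cost about the same as an empty shader. A noticeable slowdown would suggest the loops keep running after the discard.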
Thanks,
Fred
NB: drivers tested: 285.58, 260.99. Windows XP 32-bit, GeForce GT 430 1024MB.