I’m doing GPGPU work (using GLSL, FBOs, ARB_texture_rectangle and NV_float_buffer) for my thesis at the moment and have had some unpleasant experiences.
I use a GeForce FX 5600 Go and a GeForce 6800 GT as my dev / test machines.
I think I found two bugs:
gl_TexCoord[0] = gl_MultiTexCoord0.xyxy + vec4(-0.5,-0.5,0.5,0.5);
should be equivalent to:
gl_TexCoord[0].x = gl_MultiTexCoord0.x - 0.5;
gl_TexCoord[0].y = gl_MultiTexCoord0.y - 0.5;
gl_TexCoord[0].z = gl_MultiTexCoord0.x + 0.5;
gl_TexCoord[0].w = gl_MultiTexCoord0.y + 0.5;
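Mathematically the two snippets are identical: the `.xyxy` swizzle just replicates the texcoord into all four components before the offset is added. A minimal CPU-side Python sketch of that equivalence (the texcoord value is made up for illustration):

```python
def snippet1(tc):
    # gl_MultiTexCoord0.xyxy + vec4(-0.5, -0.5, 0.5, 0.5)
    x, y = tc
    swizzled = (x, y, x, y)                 # the .xyxy swizzle
    offsets = (-0.5, -0.5, 0.5, 0.5)
    return tuple(s + o for s, o in zip(swizzled, offsets))

def snippet2(tc):
    # component-wise version (the workaround that compiled correctly)
    x, y = tc
    return (x - 0.5, y - 0.5, x + 0.5, y + 0.5)

# Hypothetical texcoord, e.g. a texel center with ARB_texture_rectangle's
# unnormalized coordinates.
tc = (2.0, 2.0)
assert snippet1(tc) == snippet2(tc) == (1.5, 1.5, 2.5, 2.5)
```

So any difference between the two compiled shaders is a compiler issue, not a semantic one.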
But after comparing the compiled shader assembly (dumped with NVemulate) for both snippets, the output interpolants were initialized differently from the constant register.
So I stuck with snippet 2.
Using the GeForce FX 5600 Go, my algorithms run with the same results as my CPU reference implementation.
Then I switched to the GeForce 6800 GT and was shocked to find that the results differed!
So I used NVemulate to disable the NV4X GLSL features… and it ran as it should have. So I think the GLSL output for the NV4X profiles has a bug.
As my deadline is approaching fast, I don’t have the time to locate the bugs more exactly, so I’m using the workarounds instead.
But I would be willing to provide a test app.