GeForce 8600, 8800

Is there some kind of limit in the drivers?
I believe it should support 60 varying floats, which is 15 vec4s.

It seems to be limited to 8 vec4s.
Can someone test on their system?

I’ve found that I’m able to use far more than 8 vec4s with GLSL on my 8800GTS, but it appears that Cg (in Linux anyways) does not deactivate unused built-in varyings, and so I end up having to bind things to texture coordinate varyings, etc.

I remember running into a bug in the ATI drivers. I was using 8 varying variables plus gl_FragCoord. That meant 9 varying variables were actually in use, and the shader failed to compile without explaining why.
I simply collapsed some vec2s together and combined some floats with vec3s, and it worked. Maybe it’s the same on GeForce GPUs: you have to manually pack float varying variables into vec4 varying variables because the compiler isn’t smart enough to do it for you.
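To make the packing arithmetic concrete, here is a minimal C sketch. It assumes (this is an assumption, not documented driver behaviour) a compiler that gives every varying its own vec4 interpolator, and compares that with the lower bound you get by packing the same components by hand. The function names and the example list of varyings are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Interpolators used if every varying occupies a whole vec4 slot
   (assumed behaviour of a compiler that does not pack). */
int slots_unpacked(const int *components, size_t count)
{
    for (size_t i = 0; i < count; i++)
        assert(components[i] >= 1 && components[i] <= 4);
    return (int)count;
}

/* Lower bound on vec4 slots if the components are packed by hand:
   total float count divided by 4, rounded up. */
int slots_packed(const int *components, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        assert(components[i] >= 1 && components[i] <= 4);
        total += components[i];
    }
    return (total + 3) / 4;
}
```

For nine varyings with 2, 2, 1, 3, 1, 3, 4, 4 and 4 components, the first function gives 9 and the second gives 6, which is how collapsing vec2s and floats can bring a shader back under an 8-interpolator limit.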

As for the limit itself, I don’t know if the GeForce 8 series supports more than 8 varying variables. delphi3d.net does say 60 floats, which would mean 15 vec4s.

It’s kind of weird. gl_TexCoord[0] plus my own varyings, some vec3, some vec4, makes 9 varyings.
It gives a linker error. I’m going to assume this is a freak accident and put it on the back burner for now.

BTW, another guy had the same issue :
http://www.gamedev.net/community/forums/topic.asp?topic_id=478423

Overlap between fixed function (FF) varying bindings and user supplied bindings is not a problem if done right.

Simply don’t use the FF bindings and roll all of your own.

IN SHADER

attribute vec4 varA;
attribute vec4 varB;

THEN IN PROGRAM

glBindAttribLocation(prog, 0, "varA");
glBindAttribLocation(prog, 1, "varB");

and such to force naming to bind to specific varying locations

then later, just use

glEnableVertexAttribArrayARB(n);
glBindBuffer(GL_ARRAY_BUFFER,…);
glVertexAttribPointerARB(n,…);

Don’t forget that the sum of generic and conventional attributes must be less than MAX_VERTEX_ATTRIBS, and you gotta leave room for a generic mat4 (or 4 contiguous vec4s)… like I did.

Those are attributes.
I’m talking about varying aka interpolators.

Also, I did not want to confuse anyone but I’ll throw this in:
I downloaded the latest Cg compiler, I compiled my GLSL to the vp40/fp40 profile and the compiler is having the same issue.

So it looks like even their Cg compiler is affected, or it’s confused.

Did you try compiling to the gp4vp/gp4fp profile?

N.

On my system, with an 8800 GTS running Linux with driver 100.14.19, these GLSL shaders worked:

Vertex shader:


#version 120
#extension GL_EXT_gpu_shader4 : require

varying vec3 test[20];

void main()
{
    for (int i = 0; i < 20; i++) test[i] = vec3(1.0, 1.0, 1.0);

    gl_Position = ftransform();
}

Fragment shader:


#version 120
#extension GL_EXT_gpu_shader4 : require

varying vec3 test[20];

void main()
{
    vec3 color = vec3(0.0);

    for (int i = 0; i < 20; i++) color += test[i] / 20.0;

    gl_FragColor = vec4(color, 1.0);
}

When I try 21 varyings, my system hardlocks. This is odd though - I thought gl_Position counted against me as a varying? The compiler must not share it with the fragment shader though since I’m not using it. Or maybe my test is flawed somehow even though it did crash with 21 vec3 varyings (63 floats).
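The arithmetic behind that test can be sketched in C. The 60-float budget is the figure quoted earlier in the thread for the GeForce 8 series. The function name is hypothetical, and the check only counts floats; it says nothing about how a real driver allocates interpolators.

```c
#include <assert.h>

/* Returns 1 if `count` varyings of `components` floats each fit
   within a budget of `max_floats` varying floats, 0 otherwise. */
int varyings_fit(int count, int components, int max_floats)
{
    assert(count >= 0 && max_floats >= 0);
    assert(components >= 1 && components <= 4);
    return count * components <= max_floats;
}
```

With a budget of 60, `varyings_fit(20, 3, 60)` returns 1 (60 floats) and `varyings_fit(21, 3, 60)` returns 0 (63 floats), which matches the point where the test above stopped working.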

That sounds about right. Apparently gl_Position does not count against the varying variables count. I’m already able to pass the position along with 8 vec4 texture coordinates to the fragment shader on a Geforce 7600 GO which only supports 32 varying floats. My guess is that gl_Position follows a different path because, AFAIK, it’s the only variable that is required to be written to in the vertex shader.

N.

With the 169.25 drivers I get 15 varying vec4s (60 floats), in addition to gl_Position.

But when mixing built-ins with my own varyings I get some strange readings.

Personally I’m sticking with my own varyings; the built-ins are heading south pretty soon anyway.

No, what do they do?

The spec says what is a varying and what is not. You have to actually use a varying in the vertex and fragment shader for it to count as a varying, else an interpolator is not used.

What about 169.09 on Vista 64-bit? If you have XP, how about on XP?
In addition to gl_Position? What does your vs/fs look like?

Good to know. I’m beginning to hate all the built in stuff of GLSL.

Dunno about 169.09.

For the moment I’m on Vista 32 exclusively (my XP/Linux box was killed in a bizarre gardening accident).

The shader code is pretty much what HexCat posted, only I used vec4.

The gp4vp/gp4fp profiles compile the code for the GeForce 8 architecture, e.g. NV_fragment_program4 for gp4fp instead of NV_fragment_program2 for fp40. Here’s the profile list.

N.

You’re right, actually using the position in the fragment shader (next to the 8 other vec4 of course) invalidated the shader. But how does the fragment shader know the depth of the fragment it is writing if it does not use an interpolator for the position attribute?

N.

Oops, sorry about that, perhaps next time I should read better before posting…

Wouldn’t you be register starved long before you reached the limit of varyings? Not register starved in the traditional CPU sense, but in the sense that register usage limits the number of threads in flight on the GeForce 8 series, hurting performance?

The position written to gl_Position doesn’t count as consuming an interpolator.
Probably if you actually put in a line like
varying vec4 position;
position = ftransform();

then it’s going to consume an interpolator.