How many varyings am I really using?

NVIDIA Quadro FX 3400
driver = 71.89

I have GL_MAX_VARYING_FLOATS_ARB = 32. For a float, vec2, vec3, or vec4, I would expect to use 1, 2, 3, or 4 varying floats, respectively. However, it appears that any one of these types uses up 4 varying floats. I am running out of varying floats far too fast.

I haven’t found any profiling utility that tells me if this is correct or not. Is this a known bug or am I just insane?

That is the expected behavior of most hardware: any non-matrix varying counts as 4 floats, and each row of a matrix takes 4 floats. Though, technically, if the compiler tried to be even slightly clever about it, it wouldn’t have to…
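For illustration, here is a hypothetical set of declarations (names made up) with the slot counts they would produce under the padding behavior described above:

varying float fogFactor;    // 1 float declared, 4 floats consumed
varying vec2  texCoord;     // 2 floats declared, 4 floats consumed
varying vec3  normal;       // 3 floats declared, 4 floats consumed
varying vec4  color;        // 4 floats declared, 4 floats consumed
varying mat3  tangentBasis; // 9 floats declared, 12 consumed (3 rows x 4 floats)

That is 19 floats of actual data eating 28 of the 32 available floats.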

Any idea why they do it that way? Or is there any literature on it anywhere?

Any idea why they do it that way?
My guess is it’s to keep things optimal for parallel computation, to avoid swizzles, and perhaps to simplify the compiler at this stage in the game.

The solution is to bunch them together:

varying vec3 normal;
varying float distance;

becomes

varying vec4 normal_distance;

Did some nvidia document discourage this? Maybe it’s in the gpu programming guide.
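For context, here is a minimal sketch of the packed varying in use (the attenuation math in the fragment shader is just a placeholder):

// Vertex shader: pack the vec3 normal and the float distance into one slot.
varying vec4 normal_distance;
void main()
{
    gl_Position = ftransform();
    vec3 n = gl_NormalMatrix * gl_Normal;
    float d = length(vec3(gl_ModelViewMatrix * gl_Vertex));
    normal_distance = vec4(n, d);
}

// Fragment shader: unpack with swizzles.
varying vec4 normal_distance;
void main()
{
    vec3 normal = normalize(normal_distance.xyz);
    float dist = normal_distance.w;
    gl_FragColor = vec4(normal * 0.5 + 0.5, 1.0) / (1.0 + dist);
}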

The solution is to bunch them together
No, the actual solution is for hardware vendors to get off their butts and do this for us. We shouldn’t have to do it. It’s a high-level language; it can abstract stuff like that away easily enough.

I’m sort of surprised hardware vendors don’t do this for us already. All the hardware I have here at work seems to have 32 varying floats, which really isn’t enough to be wasting any. It’s easy to use them up with a moderately complex shader.
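For example, a fairly ordinary per-pixel lighting setup (hypothetical names), counted at 4 floats per varying as described above:

varying vec2  texCoord;    // 4 floats
varying vec3  normal;      // 4 floats
varying vec3  tangent;     // 4 floats
varying vec3  lightVec;    // 4 floats
varying vec3  halfVec;     // 4 floats
varying vec3  eyeVec;      // 4 floats
varying vec4  shadowCoord; // 4 floats
varying float fogFactor;   // 4 floats

Eight varyings at 4 floats each, and all 32 are gone.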

Originally posted by V-man:
The solution is to bunch them together:

varying vec3 normal;
varying float distance;

becomes

varying vec4 normal_distance;

Did some nvidia document discourage this? Maybe it’s in the gpu programming guide.
Well, it is a good solution if you want to save one varying float, but it is bad if you want to encode four vec3’s into three vec4’s.

yooyo
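To illustrate yooyo’s point (names hypothetical): packing four vec3’s into three vec4’s makes the values straddle slot boundaries, and the fragment shader has to reassemble them with awkward swizzles:

varying vec4 pack0; // a.xyz in xyz, b.x in w
varying vec4 pack1; // b.yz in xy, c.xy in zw
varying vec4 pack2; // c.z in x, d.xyz in yzw

// Fragment-side unpacking:
vec3 a = pack0.xyz;
vec3 b = vec3(pack0.w, pack1.xy);
vec3 c = vec3(pack1.zw, pack2.x);
vec3 d = pack2.yzw;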

Of course, I agree with you, Korval.
I have had to use this solution myself a few times to avoid multipass.

Tip: if you are using a texcoord for your 2D texture, you might want to do this:

gl_TexCoord[0].xy = gl_MultiTexCoord0.xy;

and the zw is free for a vec2!
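A sketch of that trick in context (baseMap and secondCoord are placeholders for whatever texture and extra vec2 you are passing):

// Vertex shader:
gl_TexCoord[0].xy = gl_MultiTexCoord0.xy; // base texture coordinates
gl_TexCoord[0].zw = secondCoord;          // extra vec2, riding in the same slot

// Fragment shader:
vec4 base  = texture2D(baseMap, gl_TexCoord[0].xy);
vec2 extra = gl_TexCoord[0].zw;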

No, the actual solution is for hardware vendors to get off their butts and do this for us. We shouldn’t have to do it. It’s a high-level language; it can abstract stuff like that away easily enough.
I believe this is a vendor-specific problem; 3Dlabs Realizm cards advertise 64 scalar varyings, and these can be any combination of floats, vec2s, vec4s, etc.

The solution is to bunch them together:
NO! Shaders should be programmed intuitively, with best practices when possible. If you are writing shaders to work across graphics cards from multiple vendors, workarounds such as this should not be used generically. Your application can query the vendor string and send the straightforward shader to one vendor and the shader containing the workaround to the other. Packing multiple variables into a single vec4 could hinder performance, and it potentially prevents the compiler from optimizing the shader to its fullest extent.

gl_TexCoord[0].xy = gl_MultiTexCoord0.xy;
Again, it is much better to write your code intuitively:

varying vec2 textureCoord;
. . . 
textureCoord = gl_MultiTexCoord0.xy;

If you’re writing software with shaders to work across platforms, write them in a portable manner. If a particular vendor requires a workaround, provide a shader for that purpose for that vendor’s product, and send a shader written with best practices to the cards that support it.

Agreed. Putting this workaround in my shaders proved to be fairly painful. I hope other vendors follow your lead and give us all the varyings we thought we paid for soon :)

I believe this is a vendor specific problem, 3Dlabs Realizm cards advertise 64 scalar varyings and these can be any combination of floats, vec2, vec4s, etc.
Yes, but nobody cares about 3DLabs hardware ;)

If you’re writing software with shaders to work across platforms, write them in a portable manner, if a particular vendor requires a workaround - provide a shader for this purpose, for that vendor’s product, and send a shader written with best practices to those cards that support it.
And how are you supposed to know? Keep adding varyings until it breaks, and then present a substitute shader? Test on every piece of hardware before releasing code? Bah. I would prefer that implementers fix their compilers rather than having us do it ourselves, but I also happen to live in a world where implementers aren’t willing to fix this problem. So, we do what we must for the greatest good.

I like my code to be bullet-proof and not to die, fail, or fall down some weird path just because of a driver screwup.
