Too many out variables?

Hello,

I recently started developing something that will involve quite a lot of outputs, but that’s just background. The real problem is closely related, though.

I wanted to see how many variables I can output from a vertex shader, so I set up transform feedback and started creating arrays like this:

flat out uint numN[11];

and substituted 1, 2, 3… for N. I’m using an Nvidia 540M, and the GLSL compiler reports an error if I exceed N=11 (not enough space to initialize). That’s fine, because I’m notified about it. But there’s something else: it looks like memory gets corrupted if I use more than 32 integers, that is, N>=3. Below are some transform feedback results for various N.

N = 2
1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2

N = 3
1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 1

N = 11
1 1 1 1 1 1 1 1 1 1 1 4 7 10 2 2 2 2 2 2 2 2 5 8 11 3 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 4 7 10 2 2 2 2 2 2 2 2 5 8 11 3
3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 4 7 10 2 2 2 2 2 2 2 2 5 8 11 3 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 4 7 10 2 2 2 2 2 2 2
 2 5 8 11

The varying names passed to the transform feedback setup are as follows:

num1[0], num1[1], ..., num1[10], num2[0], ..., num11[10]

(This is for the last case; N=2 uses only num1 and num2, and so on.) In the shader, every element of numN is set to N:

num1[0] = 1, num1[1]=1, ..., num2[0]=2, ...

As you can see for N=3, the last number should be 3, not 1. Since this is the 33rd number, it really looks like I can capture only up to 32 integers through transform feedback (N=2 has no problems and N=11 is a total mess). This should result in an error rather than memory corruption, though; if I hadn’t tested it first, I would have a headache by now.

Can anyone explain? Is there something like MAX_TRANSFORM_FEEDBACK_VARYINGS, and is this even related to transform feedback, or is it a GLSL problem in the first place?

Thanks

Hi,
Could you show us the relevant GL calls and shaders?

I wanted to see how many variables I can output from a vertex shader, so I set up transform feedback and started creating arrays like this.

There are several MAX enums for this purpose:

  GL_MAX_VERTEX_OUTPUT_COMPONENTS
  GL_MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS
  GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS
  GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS

For my Quadro 4000 card, they’re 128, 128, 4, and 4. So I can pass up to 128 floats between shaders, collect up to 128 floats in a single TF buffer, or collect up to 4 components each in 4 different buffers. A 32-bit integer counts the same as a 32-bit float for the purposes of these enums. What are the values for your card?
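
As a rough illustration of how those four limits combine, here is a small C sketch. The function names are made up for this example, and it only does arithmetic on values you would already have queried with glGetIntegerv. Treat the results as upper bounds, not guarantees.

```c
/* Upper bound on components captured into one interleaved TF buffer:
   you cannot capture more than the vertex shader is allowed to output. */
int max_capture_interleaved(int max_vertex_out, int max_tf_interleaved)
{
    return max_vertex_out < max_tf_interleaved ? max_vertex_out
                                               : max_tf_interleaved;
}

/* Upper bound in separate mode: one buffer per captured attribute,
   each limited to a fixed number of components. */
int max_capture_separate(int max_tf_separate_attribs,
                         int max_tf_separate_components)
{
    return max_tf_separate_attribs * max_tf_separate_components;
}
```

With the Quadro 4000 values above (128, 128, 4, 4), this gives 128 components in interleaved mode and 16 in separate mode.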

Same as yours,

MAX VERT OUT COMP: 128
MAX TF INTERL COMP: 128
MAX TF SEPAR COMP: 4
MAX TF SEPAR ATRIB: 4

I believe these are driver-defined, so the values will be the same for all cards that support them.

Since I use interleaved mode, that would match my observations, but only if 1 component = 1 byte rather than 1 variable. I’ll test some other configurations (an array of 128, 128 separate variables, …).

Perhaps it’s padding single floats and ints to vec4 size when passing varyings; that might explain it (32x4 = 128).
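
The padding guess can be checked with simple arithmetic. The sketch below is plain C, not driver code. It assumes, hypothetically, that each scalar output or array element occupies a full 4-component slot, and the helper names are invented for illustration.

```c
#include <stdbool.h>

/* Components consumed by `count` scalar outputs if they are tightly packed. */
int components_packed(int count) { return count; }

/* Components consumed if every scalar is padded to a vec4-sized slot
   (an assumption about the driver, not documented behaviour). */
int components_padded(int count) { return count * 4; }

/* True if the usage fits within a reported limit such as
   GL_MAX_VERTEX_OUTPUT_COMPONENTS (128 on the cards in this thread). */
bool fits(int used, int limit) { return used <= limit; }
```

Under that assumption, 32 scalars use exactly 128 components and fit, while 33 would need 132 and do not, which matches the point where the output goes wrong.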

That would explain it if I were using variables one by one, but why would they pad arrays? Anyway, I get the same behavior with a single array: fewer than 33 elements works fine, and beyond that the array becomes invalid (more and more so as the size grows). What’s more, I get a memory error when calling glFinish() if I exceed something like 80 elements, though with no GLSL error this time.

The code itself is, I believe, irrelevant: I’ve been using the same loading interface for several programs without problems, and the shader is just a few lines:

#version 150

// One uint per array element; 33 elements is the smallest size that shows the problem.
flat out uint num1[33];

void main(){
	for(uint i = 0u; i < 33u; i++)
		num1[i] = i;
}

With this, you should see the problem when reading the last (33rd) value.

In practice, if I wanted to artificially limit the user to these 32 components in the shader, does OpenGL provide a way to check how many are in use, or would I need to count them myself?

That would explain it if I were using variables one by one, but why would they pad arrays?

Hardware register size of 128b, perhaps? I’m just guessing here, based on the results you’re getting.

You mean that each shader unit would have a 128-byte register it can write to? That would explain the error with a single array. However, the checking for these errors seems to be buggy or missing.

Anyway, I packed the executable (and its dependencies) so you can check its behavior (http://www.box.com/s/36x5crkzqvhk8nki5qcz). You can’t edit the shader, because that would require a change to the source code, but I’m interested in whether you get 32 or 0 as the last number.

My Quadro is currently installed on my Linux system, and my Windows system has an AMD 6950 installed. The Cat 11.11 driver reports “Error: Too many outputs specified from the vertex shader” for your program. Its stats are similar to the Quadro and your card (Voc=128, TFi=64, TFb=4, TFc=4), though a bit lighter on the number of components it can capture in an interleaved TF buffer.

Thanks for trying. It looks like you were right in assuming that every output is aligned to its vec4 version (vec4, uvec4, ivec4, …), because I can get many more values out using those than using arrays of the “scalar” types. I think it’s a driver bug: it computes the limit from the size of what was declared but works with the aligned versions. Most of the time people use far fewer outputs, and it’s just a matter of stopping compilation and reporting an error, but it can be annoying if you aren’t aware of it. I think I’ll discuss this further on the Nvidia forums. Thank you once more.
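
For anyone wanting to work around this by packing scalars into 4-component vectors, the index arithmetic might look like the C sketch below. The function names are hypothetical, and it assumes a layout where scalar i lives in vector i/4, lane i%4 (x, y, z, w).

```c
/* Number of 4-component vectors (e.g. uvec4 outputs) needed for `count` scalars. */
int vec4_slots_needed(int count) { return (count + 3) / 4; }

/* Index of the vector that holds scalar i. */
int vec4_index(int i) { return i / 4; }

/* Lane within that vector: 0 = x, 1 = y, 2 = z, 3 = w. */
int vec4_lane(int i) { return i % 4; }

/* Components actually consumed once the scalars are packed 4 per vector. */
int components_when_packed(int count) { return vec4_slots_needed(count) * 4; }
```

With this layout, 33 values would occupy 9 vectors (36 components) instead of one padded slot per value.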
