
Maximum array size in geometry shader

Rodan
09-11-2007, 09:43 AM
Hi,
I'm tessellating a single triangle recursively with a geometry shader, and things are fine up to recursion level n (simulated with iteration obviously), but when I go to n+1, some triangles are missing. I have checked the basic things, like setting maximum output vertices to be large enough, prior to linking, and now I'm looking into other limitations I may not have known about. I'm also outputting triangle strips, but they're really just single triangles, with an EndPrimitive() after every 3 EmitVertex() calls. Anyway, on to my question...

I declare 2 local arrays in main(), of sizes 341 and 153, which hold ivec3's and vec4's respectively. These are worst case sizes, and at level n+1 I'm doing the worst case, so I thought maybe there was some limit (e.g. 256) for array sizes I finally hit, but for the life of me I can't find the GL variable to query to determine this. I've been through the glew header looking for hints, and the archives here. Anyone know? Thanks.
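
For reference, the array sizes quoted above can be reproduced with a little arithmetic. This is only a sketch under an assumption the post does not state: that each recursion level splits a triangle into 4 by its edge midpoints. Under that assumption, 341 is the number of triangles in all levels 0 through 4, and 153 is the number of distinct vertices at level 4. The function names are made up for illustration.

```python
# Assumption (not stated in the thread): 1-to-4 midpoint subdivision per level.

def triangles_at_level(n):
    """Triangles produced at recursion level n (level 0 is the input triangle)."""
    return 4 ** n

def nodes_through_level(n):
    """All triangles in levels 0..n, i.e. 1 + 4 + ... + 4**n."""
    return (4 ** (n + 1) - 1) // 3

def unique_vertices_at_level(n):
    """Distinct vertices when each edge is cut into 2**n segments."""
    k = 2 ** n
    return (k + 1) * (k + 2) // 2

def emitted_vertices_at_level(n):
    """EmitVertex() calls when every triangle is emitted as its own strip."""
    return 3 * triangles_at_level(n)

if __name__ == "__main__":
    for level in range(5):
        print(level,
              triangles_at_level(level),
              nodes_through_level(level),
              unique_vertices_at_level(level),
              emitted_vertices_at_level(level))
```

If the assumption holds, the worst case at level 4 needs 768 EmitVertex() calls, so the maximum output vertex setting matters at least as much as the local array sizes.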

pudman
09-11-2007, 04:06 PM
Originally posted by Rodan:
I declare 2 local arrays in main(), of sizes 341 and 153, which hold ivec3's and vec4's respectively.

I'm not super familiar with shader programming, but isn't that a lot of data to declare on the stack?

Does anyone know if recursion incurs any of the same loop limitations?

@Rodan:
Have you tried implementing this portion of the shader code as a loop instead of recursion?

oc2k1
09-11-2007, 07:15 PM
It's not recommended to use a recursive algorithm with many temporary variables. Often it's better to tessellate with two loops (which will be unrolled by the compiler).

In a shader it's not possible to allocate memory from a heap or to store variables on a stack. The only native storage is temporary registers. If too many are used, the shader will be slow, because the number of shader instances running in parallel is limited by the GPU's registers.

Rodan
09-12-2007, 07:34 AM
Originally posted by oc2k1:
It's not recommended to use a recursive algorithm with many temporary variables. Often it's better to tessellate with two loops (which will be unrolled by the compiler).

This is what I am doing; it's what I meant when I said I was simulating recursion with iteration (loops). Sorry, I should have been clearer. At this point, it looks like I'm limited to emitting 128 vertices, even though querying the maximum output vertices returns 1024. When I look at the assembly source via NVemulate, it says VERTICES_OUT=128. However, I think this is a bug: no matter how far above 128 I set max vertices, the assembly still says 128, yet when I set the value below 128, I see the corresponding reduction in my viewer. So the setting clearly has an effect; it just seems to be clamped at 128 for some reason. I suspect a driver bug, but these things are always complicated, so I'll have to do more tests to be sure. Finally, when I copy and paste the *exact* same code into a version that doesn't use geometry shaders, it works fine, so I am pretty sure the limit on EmitVertex() calls is my problem.
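
To illustrate why a cap of 128 emitted vertices would make triangles disappear at one level and not the one before, here is a rough sketch. It assumes a 1-to-4 subdivision per level and assumes that vertices past the cap are simply dropped; neither is confirmed in the thread, and the function names are hypothetical.

```python
# Assumptions: 4**level triangles per level, 3 EmitVertex() calls per triangle,
# and anything emitted past vertex_cap is silently discarded.

def surviving_triangles(level, vertex_cap):
    """Complete triangles that fit under the cap."""
    wanted = 4 ** level
    return min(wanted, vertex_cap // 3)

def missing_triangles(level, vertex_cap):
    """Triangles that would be lost at this level under the cap."""
    return 4 ** level - surviving_triangles(level, vertex_cap)

if __name__ == "__main__":
    for level in range(5):
        print(level, surviving_triangles(level, 128), missing_triangles(level, 128))
```

Under these assumptions, a cap of 128 leaves level 2 intact (48 vertices) but drops triangles at level 3 (192 vertices), whereas a cap of 1024 would cover level 4 as well.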

jgennis
09-12-2007, 04:46 PM
How many components are you trying to output per vertex? You may be exceeding the MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS_EXT limit.
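
The check suggested here can be sketched as a small calculation. In the EXT_geometry_shader4 rules, the maximum output vertex count multiplied by the components written per vertex may not exceed the total output components limit. The budget of 1024 used below is a placeholder, not a value reported in this thread; the real number has to be queried from the driver. The function name is made up for illustration.

```python
# Hypothetical helper: the limits passed in are placeholders, not queried values.

def max_emittable_vertices(max_output_vertices,
                           max_total_output_components,
                           components_per_vertex):
    """Largest vertex count that satisfies both geometry shader output limits."""
    if components_per_vertex <= 0:
        raise ValueError("components_per_vertex must be positive")
    by_components = max_total_output_components // components_per_vertex
    return min(max_output_vertices, by_components)

if __name__ == "__main__":
    # Position only (4 components) versus position plus one extra vec4 (8).
    print(max_emittable_vertices(1024, 1024, 4))
    print(max_emittable_vertices(1024, 1024, 8))
```

With that placeholder budget, 4 components per vertex allows 256 vertices and 8 components allows 128, so it is worth checking how many components the driver actually counts per vertex.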

Rodan
09-13-2007, 08:10 AM
Originally posted by jgennis:
How many components are you trying to output per vertex? You may be exceeding the MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS_EXT limit.

I was wondering about that at one point, but I'm currently only writing to gl_Position.