Tessellation Shaders specification discussion

Hi All,

Usually I don’t complain about the specification. It is what it is, and we can do nothing about it. But when I have to explain it to somebody else, I run into difficulties because I cannot defend some points in it. The main topic of today’s agenda is tessellation shaders. I would like to hear your opinions on why the specification for tessellation shaders is written the way it is.

1. The problem of indexing output array gl_out[] in the tessellation control shader (TCS).

The tessellation control shader is invoked for each vertex of the output patch! Each invocation can read the attributes of any vertex in the input or output patches, but can only write per-vertex attributes for the corresponding output patch vertex. … Tessellation control shaders must use the input variable gl_InvocationID as the vertex number index when writing to per-vertex output variables.

So, we all have to write gl_out[gl_InvocationID].gl_Position = … all the time, instead of just gl_out.gl_Position = … Why does the spec allow indexing, but only with a single value? In my opinion, gl_out should not be seen as an array from within the TCS.

2. The problem of gl_out[] array length.

The intrinsically declared tessellation control output array gl_out will also be sized by any output layout declaration. Hence, the expression gl_out.length() will return the output patch vertex count specified in a previous output layout qualifier. For outputs declared without an array size, including intrinsically declared outputs (i.e., gl_out), a layout must be declared before any use of the method length() or other array use requires its size be known. It is a compile-time error if the output patch vertex count specified in an output layout qualifier does not match the array size specified in any output variable declaration in the same shader.

Why should anyone call a function to read a constant value defined in the same program unit? Removing the array would also remove the problem with the length.

3. The problem of the tessellation primitive generator (TPG) being configured by the subsequent stage.

A layout qualifier in the tessellation evaluation shader (TES) defines the primitive mode, spacing, direction and point mode. All of this is required by the TPG to do its job. Why is it not defined in the TCS, where it logically should be?

4. The problem of writing patch out variables.

While tessellation control shader invocations may read any per-vertex and per-patch output variable and write any per-patch output variable, reading or writing output variables also written by other invocations has ordering hazards discussed below.

Many tutorials overlook that only a single invocation of the TCS should write patch out variables. Maybe the GLSL compiler should report an error if those writes are not enclosed in some if (gl_InvocationID == …) statement.
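The guarded pattern described above could look roughly like the following tessellation control shader. This is a minimal sketch, not taken from the thread: the 3-vertex patch, the `patchColor` output, and the constant tessellation levels are all assumptions chosen for illustration.

```glsl
#version 400 core

// Assumption: a triangle patch with 3 output vertices.
layout(vertices = 3) out;

// Hypothetical per-patch output; the name is illustrative only.
patch out vec3 patchColor;

void main()
{
    // Per-vertex output: every invocation writes its own element.
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Per-patch outputs: written by exactly one invocation, so there is
    // no question about which invocation's value ends up in the patch.
    if (gl_InvocationID == 0) {
        patchColor = vec3(1.0, 0.5, 0.0);

        // Constant levels, assumed here for a triangle domain.
        gl_TessLevelOuter[0] = 4.0;
        gl_TessLevelOuter[1] = 4.0;
        gl_TessLevelOuter[2] = 4.0;
        gl_TessLevelInner[0] = 4.0;
    }
}
```

Picking invocation 0 is a convention rather than a requirement; any single, fixed invocation would serve the same purpose.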

There are probably other problems in the TS spec, but those are the most apparent ones. If you have an explanation for any of them, please share it with me. :)

So, we all have to write gl_out[gl_InvocationID].gl_Position = … all the time, instead of just gl_out.gl_Position = … Why does the spec allow indexing, but only with a single value? In my opinion, gl_out should not be seen as an array from within the TCS.

Because “Each invocation can read the attributes of any vertex in the input or output patches”. You can read what other invocations have written (given an appropriate barrier). To do what you suggest would require having more variables defined: the output that you write to, and the outputs that others may have written to.

It’s just easier and more obvious what’s going on this way. You can read other people’s outputs, but you can only write to your own.
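To make the "write your own, read anyone's" rule concrete, here is a hedged sketch of a control shader that writes one vertex per invocation and then, after a barrier, reads every invocation's output. The 4-vertex patch, the `patchCenter` output, and the constant tessellation levels are assumptions made for the example.

```glsl
#version 400 core

// Assumption: a quad patch with 4 output vertices.
layout(vertices = 4) out;

// Hypothetical per-patch output that depends on all output vertices.
patch out vec3 patchCenter;

void main()
{
    // Write: only the element indexed by gl_InvocationID is allowed.
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Synchronize all invocations of this patch. barrier() is called
    // directly in main(), outside of any control flow.
    barrier();

    // Read: after the barrier, the values written by the other
    // invocations before it are visible through the same gl_out array.
    if (gl_InvocationID == 0) {
        vec4 sum = vec4(0.0);
        for (int i = 0; i < gl_out.length(); ++i)
            sum += gl_out[i].gl_Position;
        patchCenter = sum.xyz / float(gl_out.length());

        // Constant levels, assumed here for a quad domain.
        gl_TessLevelOuter[0] = 8.0;
        gl_TessLevelOuter[1] = 8.0;
        gl_TessLevelOuter[2] = 8.0;
        gl_TessLevelOuter[3] = 8.0;
        gl_TessLevelInner[0] = 8.0;
        gl_TessLevelInner[1] = 8.0;
    }
}
```

With a non-array gl_out, the loop over the other vertices would need a second, separately named variable, which is the extra declaration the reply refers to.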

Why should anyone call a function to read a constant value defined in the same program unit? Removing the array would also remove the problem with the length.

I don’t understand your question. The length is defined in a layout setting at global scope. It’s not hard to imagine a program that uses a #define for that, stitching its shader together and #defining the length outside of the shader itself. In that case, the shader code really doesn’t know the length.

There’s no “problem with the length.” Like any other array in GLSL, you can access its length with .length(). The difference is where you set the length of the array; it is done globally instead of when you declare the output array variable.
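A small sketch of that situation, under stated assumptions: `PATCH_VERTICES` is a hypothetical macro that the application would inject after the #version line when it assembles the source, and `controlColor` is an illustrative output. The shader body never spells out the count and relies on .length() instead.

```glsl
#version 400 core

// Hypothetical macro. The fallback only keeps this sketch compilable
// on its own; the application is assumed to define the real value.
#ifndef PATCH_VERTICES
#define PATCH_VERTICES 16
#endif

// The output layout declaration sizes gl_out and all unsized
// per-vertex outputs declared in this shader.
layout(vertices = PATCH_VERTICES) out;

// Unsized per-vertex output; its size comes from the layout above.
out vec3 controlColor[];

void main()
{
    // Assumption: the input patch has at least PATCH_VERTICES vertices.
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // The body queries the size rather than repeating the number.
    float t = float(gl_InvocationID) / float(gl_out.length());
    controlColor[gl_InvocationID] = vec3(t, 0.0, 1.0 - t);

    if (gl_InvocationID == 0) {
        gl_TessLevelOuter[0] = 4.0;
        gl_TessLevelOuter[1] = 4.0;
        gl_TessLevelOuter[2] = 4.0;
        gl_TessLevelOuter[3] = 4.0;
        gl_TessLevelInner[0] = 4.0;
        gl_TessLevelInner[1] = 4.0;
    }
}
```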

A layout qualifier in the tessellation evaluation shader (TES) defines the primitive mode, spacing, direction and point mode. All of this is required by the TPG to do its job. Why is it not defined in the TCS, where it logically should be?

Why should it be in the control shader? Consider the following.

Let’s say that I want to do some Bézier patch tessellation. So my control shader outputs 16 positions to interpolate between. However, what do I draw with this? I could draw triangles, lines, points, etc. If I want to tessellate with line rendering or triangle rendering, then I currently need two evaluation shaders. But they can all come from the same control shader.

Your way would require that I still have two evaluation shaders (let’s say the interpolation logic changes a bit for lines vs. triangles) and two control shaders. It creates tight coupling between stages when a looser coupling is possible.
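As a rough illustration of that decoupling, the evaluation shader sketched below can be compiled twice against one unchanged control shader: once as is, for filled quads, and once with a hypothetical `DRAW_ISOLINES` macro defined, for lines. The macro, the row-major ordering of the 16 control points, and the assumption that they are already in clip space are choices made for this example only.

```glsl
#version 400 core

// The primitive mode lives here, in the evaluation shader, so swapping
// it does not touch the control shader at all.
#ifdef DRAW_ISOLINES
layout(isolines, equal_spacing) in;
#else
layout(quads, equal_spacing, ccw) in;
#endif

// Cubic Bernstein basis weights for parameter t in [0, 1].
vec4 bernstein(float t)
{
    float s = 1.0 - t;
    return vec4(s * s * s, 3.0 * s * s * t, 3.0 * s * t * t, t * t * t);
}

void main()
{
    vec4 bu = bernstein(gl_TessCoord.x);
    vec4 bv = bernstein(gl_TessCoord.y);

    // Bicubic Bezier evaluation over a 4x4 grid of control points.
    // Assumption: control point (i, j) is stored at index 4 * j + i.
    vec4 p = vec4(0.0);
    for (int j = 0; j < 4; ++j)
        for (int i = 0; i < 4; ++i)
            p += bu[i] * bv[j] * gl_in[4 * j + i].gl_Position;

    // Assumption: control points are already in clip space.
    gl_Position = p;
}
```

If the interpolation logic really differs between lines and triangles, as the reply supposes, the two variants would be separate sources rather than one source with a switch; either way the control shader stays the same.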

Many tutorials overlook that only a single invocation of the TCS should write patch out variables.

OpenGL doesn’t exist to service the illiterate. Tessellation is not simple. Tessellation is not for beginners. It’s not for “my first OpenGL program.”

Tessellation is defined to be powerful. But, as the saying goes, with great power comes great responsibility. I’m all for making an API simpler, but only so long as the utility of the API is retained.

The ability to cross-talk between control shader invocations is useful power. And I won’t give that up because a few people online couldn’t bother to read the specification and understand what it’s saying. And adding “outputs that you write to” in addition to “outputs other guys wrote to” would only make the API more cumbersome and confusing. It might make tutorials easier (since they would only talk about “outputs that you write to”), but it would make the API harder to work with for actual people.

Thank you Alfonse for the clarification, and sorry for this late response.

Well, I knew each invocation can read gl_out values from the others, but since that requires a barrier to work well (and hence decreases performance), I thought it would be cleaner for the specification and easier to implement without allowing it. But OK, if it is a common case.

The remark about conditionally writing values into patch out variables was just my appeal to everyone who writes about TS to pay attention. Not just tutorials; even some examples in certain books don’t treat it well. Maybe it is OK to write the same values from multiple invocations, but who guarantees that the write is an atomic operation, or that the values are calculated in the same way in all invocations?

I’m also trying to clarify some aspects of TS for myself, so please forgive me for thinking out loud. :)

Thanks again!
