Tessellation Control Shader

From OpenGL.org
Revision as of 17:38, 20 November 2012 by Alfonse

Core in version: 4.0
Core ARB extension: ARB_tessellation_shader

The Tessellation Control Shader (TCS) is a Shader program written in GLSL whose primary purpose is to determine how much tessellation is applied to a rendered patch.


The TCS controls how much tessellation a particular patch gets. Its main purpose is to feed the tessellation levels to the tessellation primitive generator stage, and to feed its output values to the Tessellation Evaluation Shader stage.

Execution model

The TCS execution model is different from most other shader stages; it is most similar to Compute Shaders. Unlike Geometry Shaders, where each invocation can output multiple primitives, each TCS invocation is (in theory) responsible for producing only a single vertex of the output patch.

For each patch provided during rendering, n TCS invocations will be processed, where n is the number of vertices in the output patch. So if a rendering command draws 20 patches, and each output patch has 4 vertices, there will be a total of 20 × 4 = 80 separate TCS invocations.

The different invocations that provide data to the same patch are interconnected. These invocations all share their output values. They can read output values that other invocations for the same patch have written to. But in order to do so, they must use a synchronization mechanism to ensure that all other invocations for the patch have executed at least that far.

Because of this, it is possible for TCS invocations to share data and communicate with one another.

Output patch size

The output patch size is the number of vertices in the output patch. It also determines the number of TCS invocations used to compute this patch data. The output patch size does not have to match the input patch size.

The number of vertices in the output patch is defined with an output layout qualifier:

layout(vertices = patch_size) out;

patch_size must be an integral constant expression greater than zero and no greater than the maximum patch size (see below).
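As a minimal sketch of the qualifier in context, the shader below declares a 4-vertex output patch, so four invocations run per patch. It makes no assumption about the input patch size; the clamping of the input index and the constant tessellation levels are illustrative choices, not requirements.

```glsl
#version 400 core

// The output patch has 4 vertices, so 4 TCS invocations run per patch.
layout(vertices = 4) out;

void main()
{
    // The input patch may have a different vertex count than the output
    // patch, so clamp the index used to read from it (illustrative policy).
    int src = min(gl_InvocationID, gl_PatchVerticesIn - 1);
    gl_out[gl_InvocationID].gl_Position = gl_in[src].gl_Position;

    // Constant tessellation levels, chosen arbitrarily for this sketch.
    gl_TessLevelOuter[0] = 4.0;
    gl_TessLevelOuter[1] = 4.0;
    gl_TessLevelOuter[2] = 4.0;
    gl_TessLevelOuter[3] = 4.0;
    gl_TessLevelInner[0] = 4.0;
    gl_TessLevelInner[1] = 4.0;
}
```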


Inputs

All inputs from vertex shaders to the TCS are aggregated into arrays, based on the size of the input patch. The effective size of these arrays is the number of input vertices provided by the patch primitive. User-defined inputs can be declared as unsized arrays:

in vec2 texCoord[];

You should not attempt to index this array at or beyond the number of input patch vertices.

There are a number of built-in input variables whose contents are generated by the system:

 in int gl_PatchVerticesIn;
 in int gl_PrimitiveID;
 in int gl_InvocationID;

gl_PatchVerticesIn is the number of vertices in the input patch. gl_PrimitiveID is the index of the current patch within this rendering command. gl_InvocationID is the index of the TCS invocation within this patch. A TCS invocation usually writes to per-vertex output variables by indexing them with gl_InvocationID.

The TCS also takes the built-in variables output by the vertex shader:

in gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
} gl_in[gl_MaxPatchVertices];

Note that just because gl_in​ is defined to have gl_MaxPatchVertices​ entries does not mean that you can access beyond gl_PatchVerticesIn​ and get reasonable values.

Note that every TCS invocation for an input patch has access to the same data (except for gl_InvocationID​, which will be different for each).
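The following sketch shows how the inputs described above might be read. It assumes 3-vertex input patches and a vertex shader that writes texCoord; the per-patch output averageTexCoord and the constant tessellation levels are hypothetical, added only for illustration.

```glsl
#version 400 core

layout(vertices = 3) out;

// User-defined input, one entry per input patch vertex.
in vec2 texCoord[];

// Hypothetical per-patch output used only for this illustration.
patch out vec2 averageTexCoord;

void main()
{
    // Every invocation can read every input vertex of its patch, but
    // only indices below gl_PatchVerticesIn hold meaningful values.
    vec2 sum = vec2(0.0);
    for (int i = 0; i < gl_PatchVerticesIn; ++i)
        sum += texCoord[i];
    averageTexCoord = sum / float(gl_PatchVerticesIn);

    // Assumes the input patch also has 3 vertices.
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Arbitrary levels for a triangle patch.
    gl_TessLevelOuter[0] = 2.0;
    gl_TessLevelOuter[1] = 2.0;
    gl_TessLevelOuter[2] = 2.0;
    gl_TessLevelInner[0] = 2.0;
}
```

Because the average does not depend on gl_InvocationID, every invocation writes the same value to the per-patch output.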


Outputs

A TCS can have per-vertex outputs and per-patch outputs. They're more or less the same thing, except that per-vertex outputs are aggregated into arrays the size of the output patch vertex count and per-patch outputs are not.

Therefore, a user-defined per-vertex output variable would be defined as such:

out vec2 vertexTexCoord[];

The length of the array (vertexTexCoord.length()) will always be the size of the output patch, so you don't need to restate it.

A TCS invocation can only ever write to the per-vertex output variables that correspond to its own invocation. So writes to per-vertex outputs should be to vertexTexCoord[gl_InvocationID]. Any expression that writes to a per-vertex output without indexing it with exactly "gl_InvocationID" results in a compile-time error. This includes silly things like vertexTexCoord[gl_InvocationID - 1 + 1].

There is a built-in per-vertex output Interface Block for the traditional vertex values:

out gl_PerVertex
{
  vec4 gl_Position;
  float gl_PointSize;
  float gl_ClipDistance[];
} gl_out[];

You do not have to use this output if you do not want to.
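A short sketch of the per-vertex output rules, assuming 4-vertex input and output patches and a vertex shader that provides texCoord. The commented-out lines show writes that would be rejected; the tessellation levels are arbitrary.

```glsl
#version 400 core

layout(vertices = 4) out;

in vec2 texCoord[];

// Implicitly sized to the output patch size (4 here).
out vec2 vertexTexCoord[];

void main()
{
    // Valid: indexed with exactly gl_InvocationID.
    vertexTexCoord[gl_InvocationID] = texCoord[gl_InvocationID];
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Invalid: these writes would not compile.
    // vertexTexCoord[0] = texCoord[0];
    // vertexTexCoord[gl_InvocationID - 1 + 1] = texCoord[gl_InvocationID];

    // Arbitrary levels for a quad patch.
    gl_TessLevelOuter[0] = 4.0;
    gl_TessLevelOuter[1] = 4.0;
    gl_TessLevelOuter[2] = 4.0;
    gl_TessLevelOuter[3] = 4.0;
    gl_TessLevelInner[0] = 4.0;
    gl_TessLevelInner[1] = 4.0;
}
```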

Patch variables

Per-patch output variables are not aggregated into arrays (unless you want them to be, in which case you must specify a size). All TCS invocations for this patch see the same patch variables. They are declared with the patch​ keyword:

patch out vec4 data;

Any TCS invocation can write to a per-patch output.

There are two special built-in patch output variables:

patch out float gl_TessLevelOuter[4];
patch out float gl_TessLevelInner[2];

These define the outer and inner tessellation levels used by the tessellation primitive generator. They define how much tessellation to apply to the patch. Their exact meaning depends on the type of patch (and other settings) defined in the Tessellation Evaluation Shader.

If multiple TCS invocations write to the same patch output, they should write the same value. This is guaranteed so long as the math and logic they use to compute the values written to patch outputs do not use gl_InvocationID​ in any way.
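As an illustration of per-patch outputs, the sketch below writes a user-defined patch variable and the built-in tessellation levels from every invocation. The uniform tessLevel and its default value are assumptions made for this example; since none of the written values depend on gl_InvocationID, all invocations write identical data.

```glsl
#version 400 core

layout(vertices = 4) out;

// User-defined per-patch output, shared by all invocations of the patch.
patch out vec4 data;

// Hypothetical uniform controlling the amount of tessellation.
uniform float tessLevel = 8.0;

void main()
{
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Per-patch writes: the same value from every invocation.
    data = vec4(1.0);

    gl_TessLevelOuter[0] = tessLevel;
    gl_TessLevelOuter[1] = tessLevel;
    gl_TessLevelOuter[2] = tessLevel;
    gl_TessLevelOuter[3] = tessLevel;
    gl_TessLevelInner[0] = tessLevel;
    gl_TessLevelInner[1] = tessLevel;
}
```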


Synchronization

TCS invocations that operate on the same patch can read each other's output variables, whether per-patch or per-vertex. To do so, they must first ensure that those invocations have actually written to those variables. The value of all output variables is undefined initially.

Ensuring that invocations have written to a variable requires synchronization between invocations. This is done via the barrier() function. When executed, it will not complete until all other TCS invocations for this patch have reached that barrier. This means that all writes issued before the barrier have occurred by this point. However, an invocation that has passed the barrier may go on to write to those variables again while other invocations are still reading them. If you need to overwrite a variable that other invocations read, issue another barrier() between the reads and the new writes. If nothing writes to those variables after the barrier, no further synchronization is needed.

The barrier()​ function has significant restrictions on where it can be placed. It must be placed:

  • Directly in the main()​ function. It cannot be in any other functions or subroutines.
  • Outside of any flow control. This includes if​, for​, switch​, and the like.
  • Before any use of return​, even a conditional one.

This ensures that every TCS invocation hits the same sequence of barrier() calls in the same order every time. The compiler will report an error if any of these restrictions are violated.
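The placement rules can be seen in a small sketch. Each invocation writes its own output position, all invocations meet at a single barrier() placed directly in main() and outside any flow control, and only then do they read each other's outputs. The per-patch output patchCenter is a hypothetical name, and 4-vertex input and output patches are assumed.

```glsl
#version 400 core

layout(vertices = 4) out;

// Hypothetical per-patch output: the average of the output positions.
patch out vec4 patchCenter;

void main()
{
    // Step 1: each invocation writes only its own per-vertex output.
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Step 2: wait until every invocation for this patch has written.
    barrier();

    // Step 3: outputs written by other invocations are now safe to read.
    // Nothing writes to gl_out after the barrier, so one barrier suffices.
    vec4 center = vec4(0.0);
    for (int i = 0; i < gl_out.length(); ++i)
        center += gl_out[i].gl_Position;
    patchCenter = center / float(gl_out.length());

    // Arbitrary levels for a quad patch.
    gl_TessLevelOuter[0] = 4.0;
    gl_TessLevelOuter[1] = 4.0;
    gl_TessLevelOuter[2] = 4.0;
    gl_TessLevelOuter[3] = 4.0;
    gl_TessLevelInner[0] = 4.0;
    gl_TessLevelInner[1] = 4.0;
}
```

The loop after the barrier only reads outputs, so it does not run afoul of the rule that per-vertex writes must be indexed with gl_InvocationID.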


Limitations

There is a maximum output patch size, defined by GL_MAX_PATCH_VERTICES; the vertices output qualifier must be no greater than this value. The minimum required limit is 32.

There are other limitations on output size, however. The number of components for active per-vertex output variables may not exceed GL_MAX_TESS_CONTROL_OUTPUT_COMPONENTS. The minimum required limit is 128.

The number of components for active per-patch output variables may not exceed GL_MAX_TESS_PATCH_COMPONENTS. The minimum required limit is 120. Note that the gl_TessLevelOuter and gl_TessLevelInner outputs do not count against this limit (but other built-in outputs do if you use them).

There is a limit on the total number of components that can go into an output patch. To compute the total number of components, multiply the number of active per-vertex components by the number of output vertices, then add the number of active per-patch components. This number may not exceed GL_MAX_TESS_CONTROL_TOTAL_OUTPUT_COMPONENTS. The minimum required limit is 4096, which is not quite enough to use a 32-vertex patch with 128 per-vertex components and 120 per-patch components (32 × 128 + 120 = 4216). But it's still a lot.