Geometry Shader

From OpenGL.org
Revision as of 19:04, 27 August 2012 by Alfonse (Talk | contribs) (Overview.)

Jump to: navigation, search
Geometry Shader
Core in version 3.2
ARB extension ARB_geometry_shader4

A Geometry Shader (GS) is a Shader program written in GLSL that governs the processing of primitives. It happens after primitive assembly, as an additional optional step in that part of the pipeline. A GS can create new primitives, unlike vertex shaders, which are limited to a 1:1 input to output ratio. A GS can also do layered rendering; this means that the GS can specifically say that a primitive is to be rendered to a particular layer of the framebuffer.

Unlike other shader stages, a geometry shader is optional and does not have to be used.

Note: While geometry shaders have had previous extensions like GL_EXT_geometry_shader4 and GL_ARB_geometry_shader4, these extensions expose the API and GLSL functionality in very different ways from the core feature. This page describes only the core feature.

Overview

Geometry shaders sit between vertex shaders and the rasterizer. Vertex shaders have a 1:1 ratio of vertices input to vertices output. Each vertex shader invocation gets one vertex and writes one vertex.

Geometry shader invocations take a single Primitive as input and may output zero or more primitives. There are implementation-defined limits on how many primitives can be generated from a single GS invocation.

While the GS can be used to amplify geometry, implementing a form of tessellation, this is not the primary use for the feature. The general uses for GS's are:

  • Layered rendering: taking one primitive and rendering it to multiple images without having to change bound rendertargets and so forth.
  • Transform Feedback: This is often employed for doing computational tasks on the GPU (obviously pre-Compute Shader).

In OpenGL 4.0, GS's gained two new features: the ability to write to multiple output streams. This is used exclusively with transform feedback, such that different feedback buffer sets can get different transform feedback data.

The other feature was GS instancing, which allows multiple invocations to operate over the same input primitive. This makes layered rendering easier to implement.

Primitive in/out specification

Each geometry shader is designed to accept a specific Primitive type as input and to output a specific primitive type. The accepted input primitive type is defined in the shader:

layout(input_primitive​) in;

The input_primitive​ type must match the primitive type used with the rendering command that renders with this shader program. The valid values for input_primitive​, along with the valid OpenGL primitive types, are:

GS input OpenGL primitives vertex count
points GL_POINTS 1
lines GL_LINES, GL_LINE_STRIP, GL_LINE_LIST 2
line_adjacency GL_LINES_ADJACENCY, GL_LINE_STRIP_ADJACENCY 4
triangles GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN 3
triangles_adjacency GL_TRIANGLES_ADJACENCY, GL_TRIANGLE_STRIP_ADJACENCY 6

The vertex count is the number of vertices that the GS receives per-input primitive.

The output primitive type is defined as follows:

layout(output_primitive​, max_vertex = vert_count​) out;

The output_primitive​ may be one of the following:

  • points
  • line_strip
  • triangle_strip

These work exactly the same way their counterpart OpenGL rendering modes do. To output individual triangles or lines, simply use EndPrimitive​ (see below) between each triangle/line.

There must be a max_vertex​ declaration for the output. The number must be a compile-time constant, and it defines the maximum number of vertices that will be written by a single invocation of the GS. It may be no larger than the implementation-defined limit of MAX_GEOMETRY_OUTPUT_VERTICES. The minimum value for this limit is 256. See the limitations below.

Instancing

GS Instancing
Core in version 4.0
Core ARB extension ARB_gpu_shader5

The GS can also be instanced. This causes the GS to execute multiple times for the same primitive. This is useful for layered rendering and outputs to multiple streams (see below).

To use instancing, there must be an input layout qualifier:

layout(invocations = num_instances​) in;

The value of num_instances​ is a compile-time constant, and must not be larger than MAX_GEOMETRY_SHADER_INVOCATIONS (the minimum implementations will allow is 32). The built-in value gl_InvocationID​ specifies the particular instance of this shader.

The output primitives from instances are ordered by the gl_InvocationID​. So 2 primitives written from 3 instances will create a primitive stream of: (prim0, inst0), (prim0, inst1), (prim0, inst2), (prim1, inst0), ...

Input values

Output values

Layered rendering

Output streams

Output limitations

There are two competing limitations on the output of a geometry shader:

  1. The maximum number of vertices that a single invocation of a GS can output.
  2. The total maximum number of output components that a single invocation of a GS can output.

The first limit, defined by GL_MAX_GEOMETRY_OUTPUT_VERTICES, is the maximum number that can be provided to the max_vertices​ output layout qualifier. No single geometry shader invocation can exceed this number.

The other limit, defined by GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS is, in layman's terms, the total amount of stuff that a single GS invocation can write. It is the total number of output components that a single GS invocation can write to. This is different from GL_MAX_GEOMETRY_OUTPUT_COMPONENTS (the maximum allowed number of components in out​ variables). The total output component is the total number of components + vertices that can be written.

For example, if the total output component count is 1024 (the smallest maximum value from GL 4.3), and the output stream writes to 12 components, the total number of vertices that can be written is 1024/12 = 85. This is the absolute hard limit to the number of vertices that can be written; even if GL_MAX_GEOMETRY_OUTPUT_VERTICES is larger than 85, because each vertex takes up 12 components, the true maximum that this particular geometry shader can write is 85 vertices.