Rendering Pipeline Overview

Revision as of 13:25, 8 September 2012 by Alfonse


The Rendering Pipeline is the sequence of steps that OpenGL takes when rendering objects. This overview will provide a high-level description of the steps in the pipeline.


[Figure: Diagram of the Rendering Pipeline]

The OpenGL rendering pipeline works in the following order:

  1. Prepare vertex array data, and then render it
  2. Vertex processing via Vertex Shader. Each vertex in the stream is processed in turn into an output vertex.
  3. Optional primitive tessellation stages.
  4. Primitive assembly and optional Geometry Shader primitive processing. The output is a sequence of primitives.
  5. Primitive Clipping, the Perspective Divide, and the Viewport Transform to window space.
  6. Scan conversion and primitive parameter interpolation.
  7. The data for each fragment is processed with a Fragment Shader. Each fragment generates a number of outputs.
  8. Per-sample operations:
    1. Scissor Test
    2. Stencil Test
    3. Depth Test
    4. Blending
    5. Logical Operation
    6. Write Mask
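
As a rough illustration of step 5 in the list above, the sketch below applies the perspective divide and the viewport transform to a single clip-space position. The function names are hypothetical, and the viewport rectangle and depth range of 0 to 1 are assumed values chosen for the example.

```python
def perspective_divide(clip):
    """Convert a clip-space position (x, y, z, w) to normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)

def viewport_transform(ndc, viewport, depth_range=(0.0, 1.0)):
    """Map normalized device coordinates in [-1, 1] to window space.

    viewport is (x, y, width, height); depth_range is (near, far).
    """
    vx, vy, width, height = viewport
    near, far = depth_range
    x_ndc, y_ndc, z_ndc = ndc
    x_win = (x_ndc + 1.0) * 0.5 * width + vx
    y_win = (y_ndc + 1.0) * 0.5 * height + vy
    z_win = (z_ndc + 1.0) * 0.5 * (far - near) + near
    return (x_win, y_win, z_win)

# Assumed example: an 800x600 viewport whose lower-left corner is at (0, 0).
window_pos = viewport_transform(perspective_divide((2.0, -1.0, 0.0, 2.0)),
                                (0, 0, 800, 600))
```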

Vertex Specification

The process of vertex specification is where the application sets up an ordered list of vertices to send to the pipeline. These vertices define the boundaries of a primitive.

Primitives are basic drawing shapes, like triangles, lines, and points. Exactly how the list of vertices is interpreted as primitives is handled via a later stage.

This part of the pipeline deals with a number of objects like Vertex Array Objects and Vertex Buffer Objects. Vertex Array Objects define what data each vertex has, while Vertex Buffer Objects store the actual vertex data itself.

A vertex's data is a series of attributes. Each attribute is a small set of data that the next stage will do computations on. While a set of attributes does specify a vertex, nothing requires that any part of a vertex's attribute set be a position or a normal. Attribute data is entirely arbitrary; the only meaning assigned to any of it happens in the vertex processing stage.
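
To make the idea of arbitrary attribute data concrete, here is a minimal sketch that packs two attributes per vertex into one interleaved byte buffer, which is the kind of data a buffer object might hold. The layout (three floats followed by four unsigned bytes) and the names `attr0` and `attr1` are assumptions made for this example; nothing in the data itself says which attribute is a position or a color.

```python
import struct

# Hypothetical vertex layout: 3 floats followed by 4 unsigned bytes.
# "<" selects little-endian packing with no padding between fields.
VERTEX_FORMAT = "<3f4B"
STRIDE = struct.calcsize(VERTEX_FORMAT)   # bytes from one vertex to the next
ATTRIBUTE_OFFSETS = {
    "attr0": 0,                           # the three floats
    "attr1": struct.calcsize("<3f"),      # the four bytes that follow them
}

def pack_vertices(vertices):
    """Pack 7-element tuples (3 floats, 4 bytes) into one interleaved buffer."""
    return b"".join(struct.pack(VERTEX_FORMAT, *v) for v in vertices)

buffer_data = pack_vertices([
    (0.0, 0.0, 0.0, 255, 0, 0, 255),
    (1.0, 0.0, 0.0, 0, 255, 0, 255),
    (0.0, 1.0, 0.0, 0, 0, 255, 255),
])

# Reading attr1 of the second vertex back out, using the stride and offset.
second_attr1 = struct.unpack_from("<4B", buffer_data,
                                  1 * STRIDE + ATTRIBUTE_OFFSETS["attr1"])
```

The stride and per-attribute offsets are the kind of information a vertex format description has to carry so that each attribute can be located in the raw bytes.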

Vertex Rendering

Once the vertex data is properly specified, it is then rendered with a rendering command as a Primitive.

Vertex Processing

Each vertex pulled from the source data must be processed. This is the responsibility of the vertex shader. It receives the attribute inputs from the previous step and converts each incoming vertex into a single outgoing vertex based on an arbitrary, user-defined program.

Unlike the input vertex information, the output vertex data has a few requirements. There is a position value which must be filled in by the vertex shader in order to emit a valid vertex.

One limitation on vertex processing is that each input vertex must map to a specific output vertex. And because vertex shader invocations cannot share state between them, the input attributes to output vertex data mapping is 1:1. That is, if you feed the exact same attributes to the same vertex shader in the same primitive, you will get the same output vertex data. This gives implementations the right to optimize vertex processing; if they can detect that they're about to process a previously processed vertex, they can use the previously processed data stored in a post-transform cache. Thus they do not have to run the vertex processing on that data again.
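
The reuse described above can be sketched as a small simulation. This is not how any particular implementation works: it assumes that vertices are identified by an index into the attribute list, and it uses an unbounded dictionary where real hardware would have a small, fixed-size cache. The names `run_vertex_stage` and `example_shader` are hypothetical.

```python
def run_vertex_stage(vertex_shader, attributes, indices):
    """Process indexed vertices, reusing results for repeated indices."""
    cache = {}          # index -> output vertex (stand-in for a post-transform cache)
    invocations = 0     # how many times the shader actually ran
    outputs = []
    for index in indices:
        if index not in cache:
            cache[index] = vertex_shader(attributes[index])
            invocations += 1
        outputs.append(cache[index])
    return outputs, invocations

def example_shader(attribute):
    """Hypothetical vertex shader: scale a 2D input and emit a 4-component position."""
    x, y = attribute
    return (x * 2.0, y * 2.0, 0.0, 1.0)

attributes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
indices = [0, 1, 2, 2, 1, 3]    # two triangles sharing an edge
outputs, invocations = run_vertex_stage(example_shader, attributes, indices)
```

Here six vertices are emitted but the shader runs only four times, which is safe only because the same input always yields the same output.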


After vertices are processed by the vertex shader, they are converted into primitives. These primitives can optionally be tessellated into multiple primitives.

Primitive Assembly

Primitive assembly is the process of collecting a run of vertex data output from the vertex shader and composing it into a viable primitive. The type of primitive the user rendered with determines how this process works.

The output of this process is an ordered sequence of simple primitives (lines, points, or triangles). If the input is a triangle strip primitive containing 12 vertices, for example, the output of this process will be 10 triangles.
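
A minimal sketch of the triangle strip case: each run of three consecutive vertices forms one triangle, so n vertices yield n - 2 triangles. Swapping the first two vertices of every other triangle is one common way to keep the winding order consistent; the exact vertex ordering used by an implementation is not specified here, and the function name is hypothetical.

```python
def assemble_triangle_strip(vertices):
    """Split a triangle strip into individual triangles."""
    triangles = []
    for i in range(len(vertices) - 2):
        a, b, c = vertices[i], vertices[i + 1], vertices[i + 2]
        if i % 2 == 1:
            a, b = b, a     # flip every other triangle to preserve winding
        triangles.append((a, b, c))
    return triangles

# Vertices are represented by their indices 0..11 for readability.
triangles = assemble_triangle_strip(list(range(12)))
```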

Tessellation Shader

Core in version 4.5; core since version 4.0; core ARB extension: ARB_tessellation_shader.

Primitives can be tessellated using two shader stages and a fixed-function tessellator between them.

Geometry Shader

In addition to the usual primitive assembly step, you can optionally use a geometry shader. This is a user-defined program that processes each incoming primitive, returning zero or more output primitives.

The input primitives for geometry shaders are the output primitives from primitive assembly. So if you send a triangle strip as a single primitive, what the geometry shader will see is a series of triangles.

However, there are a number of input primitive types that are defined specifically for geometry shaders. These adjacency primitives give a geometry shader (GS) a larger view of the primitives; they provide vertex information for adjacent vertices.

The output of a GS is zero or more simple primitives, much like the output of primitive assembly. The GS is able to remove primitives, or tessellate them by outputting many primitives for a single input. The GS can also tinker with the vertex values themselves, either to do some of the work of the vertex shader or simply to interpolate the values when tessellating. Geometry shaders can even convert primitives to different types; input point primitives can become triangles, or lines can become points.
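
The "one primitive in, zero or more primitives out" behavior can be modeled with ordinary functions. The sketch below is a simulation and not GLSL; `run_geometry_stage`, `expand_point_to_quad`, and `discard_everything` are hypothetical names, and a point is assumed to be a 2D position.

```python
def expand_point_to_quad(point, half_size=0.5):
    """Convert one point primitive into two triangles forming a square."""
    x, y = point
    bottom_left = (x - half_size, y - half_size)
    bottom_right = (x + half_size, y - half_size)
    top_left = (x - half_size, y + half_size)
    top_right = (x + half_size, y + half_size)
    return [(bottom_left, bottom_right, top_left),
            (top_left, bottom_right, top_right)]

def discard_everything(primitive):
    """A shader that removes every primitive by emitting nothing."""
    return []

def run_geometry_stage(primitives, geometry_shader):
    """Run the shader once per input primitive and collect all of its outputs."""
    output = []
    for primitive in primitives:
        output.extend(geometry_shader(primitive))
    return output

quads = run_geometry_stage([(0.0, 0.0), (3.0, 3.0)], expand_point_to_quad)
nothing = run_geometry_stage([(0.0, 0.0), (3.0, 3.0)], discard_everything)
```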

Transform Feedback

The outputs of the geometry shader or primitive assembly can be written to a series of buffer objects that have been set up for this purpose. This is called transform feedback mode; it allows the user to transform data via vertex and geometry shaders, then hold on to that data for later use.

The data output into the transform feedback buffer is the data from each primitive emitted by this step.

By discarding the rasterization results, the pipeline can effectively be ended here. This allows transform feedback to be the only output produced from rendering.

Clipping and Culling

The primitives are then clipped and appropriate culling is done.

Clipping means that primitives that lie on the boundary between the inside of the viewing volume and the outside are split into several primitives. Also, the vertex shader can define a number of clipping planes in the projected space to apply to the primitives. These clip planes cause additional clipping for any primitives that cross them.

Face culling for triangles also happens at this stage. An implementation is also free to immediately cull any primitive that is not within the viewing region, or that lies entirely on the clipped side of a clipping plane.
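
The two decisions described in this section can be sketched as follows. The first part classifies a primitive against the viewing volume, assuming the usual clip-space condition that a vertex is inside when -w <= x, y, z <= w. The second part decides facing from the sign of a triangle's area in window space, assuming counter-clockwise triangles are front-facing. The function names are hypothetical, and the classification is conservative: a primitive reported as "may need clipping" could still turn out to be entirely outside.

```python
def outcodes(vertex):
    """One flag per clip plane; True means the vertex is outside that plane."""
    x, y, z, w = vertex
    return (x < -w, x > w, y < -w, y > w, z < -w, z > w)

def classify(primitive):
    """Classify a primitive (a sequence of clip-space vertices) against the view volume."""
    codes = [outcodes(v) for v in primitive]
    if not any(any(code) for code in codes):
        return "inside"
    if any(all(code[plane] for code in codes) for plane in range(6)):
        return "outside"            # every vertex is beyond the same plane
    return "may need clipping"

def signed_area(a, b, c):
    """Signed area of a 2D triangle; positive when a, b, c are counter-clockwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def is_back_facing(triangle, front_is_ccw=True):
    """True if the window-space triangle faces away under the chosen convention."""
    area = signed_area(*triangle)
    return area < 0 if front_is_ccw else area > 0
```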


Rasterization

Primitives that reach this stage are then rasterized in the order in which they were given. The result of rasterizing a primitive is a sequence of fragments.

A fragment is a set of state that is used to compute the final data for a pixel (or sample if multisampling is enabled) in the output framebuffer. The state for a fragment includes its position in screen-space, the sample coverage if multisampling is enabled, and a list of arbitrary data that was output from the previous vertex or geometry shader.

This last set of data is computed by interpolating between the data values at the vertices of the primitive that produced the fragment. The style of interpolation is defined by the shader that output those values.
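
As an illustration of the interpolation step, the sketch below computes barycentric weights for a point inside a 2D window-space triangle and uses them to blend per-vertex values. This is plain linear interpolation in window space, which corresponds to GLSL's noperspective qualifier; the default (smooth) interpolation is perspective-correct and would also take each vertex's w into account. The function names are hypothetical.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def barycentric(a, b, c, p):
    """Weights of vertices a, b, c at point p; they sum to 1."""
    area = edge(a, b, c)
    return (edge(b, c, p) / area, edge(c, a, p) / area, edge(a, b, p) / area)

def interpolate(values, weights):
    """Blend three per-vertex tuples component-wise using the given weights."""
    return tuple(sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0])))

a, b, c = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
weights = barycentric(a, b, c, (1.0, 1.0))
color = interpolate([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)], weights)
```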

Fragment Processing

The data for each fragment from the rasterization stage is processed by a fragment shader. The output from a fragment shader is a list of colors for each of the color buffers being written to, a depth value, and a stencil value. Fragment shaders are not able to set the stencil data for a fragment, but they do have control over the color and depth values.

Per-Sample Operations

The fragment data output from the fragment processor is then passed through a sequence of steps.

The first step is a series of culling tests. The stencil test, if enabled, is performed first, followed by the depth test, if enabled. If any of these tests fails, the fragment is culled and not added to the framebuffer.

Note: If your fragment shader does not write to the depth value, and thus uses the normally computed depth value, implementations are able to perform an optimization called Early Depth Test. This performs the depth (and stencil) test before the fragment shader runs. Thus, if the fragment is culled, the fragment shader does not have to run at all.
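
The culling tests can be sketched as a pair of comparisons applied in order. The comparison functions are configurable in OpenGL; this example assumes a "less than" depth comparison and, purely for illustration, an "equal" stencil comparison. It also leaves out the stencil buffer updates that accompany the real tests. All names are hypothetical.

```python
def stencil_passes(reference, stored_stencil, mask=0xFF):
    """Assumed stencil comparison: masked reference equals masked stored value."""
    return (reference & mask) == (stored_stencil & mask)

def depth_passes(fragment_depth, stored_depth):
    """Assumed depth comparison: the fragment must be nearer than the stored depth."""
    return fragment_depth < stored_depth

def survives_tests(fragment_depth, stored_depth, stored_stencil,
                   stencil_reference=0, stencil_enabled=True, depth_enabled=True):
    """Return True if the fragment passes every enabled test, stencil first."""
    if stencil_enabled and not stencil_passes(stencil_reference, stored_stencil):
        return False
    if depth_enabled and not depth_passes(fragment_depth, stored_depth):
        return False
    return True
```

Because neither comparison depends on anything the fragment shader computes (as long as the shader does not write depth), the same check could be run before the shader, which is the idea behind the early test mentioned in the note.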

After this, blending happens. For each fragment color value, there is a specific blending operation between it and the color already in the framebuffer at that location.
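
Blending is configurable, so the following is only one common setup and not the blending operation in general: the source color is weighted by its alpha, the destination by one minus that alpha, and the two are added. This corresponds to the GL_SRC_ALPHA and GL_ONE_MINUS_SRC_ALPHA factors with additive blending. The function name is hypothetical, and colors are assumed to be (r, g, b, a) tuples of floats.

```python
def blend_src_alpha(src, dst):
    """Blend a fragment color (src) over the stored framebuffer color (dst)."""
    src_factor = src[3]             # source alpha
    dst_factor = 1.0 - src[3]       # one minus source alpha
    return tuple(s * src_factor + d * dst_factor for s, d in zip(src, dst))

# A half-transparent red fragment over an opaque blue pixel.
blended = blend_src_alpha((1.0, 0.0, 0.0, 0.5), (0.0, 0.0, 1.0, 1.0))
```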

Lastly, the fragment data is written to the framebuffer. Masking operations allow the user to prevent writes to certain values. Color, depth, and stencil writes can be masked on and off; individual color channels can be masked as well.
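
A per-channel write mask can be modeled as a simple selection between the stored value and the incoming one. This is a sketch with a hypothetical function name; the mask is assumed to be one boolean per color channel.

```python
def masked_write(stored, incoming, mask):
    """Write each incoming channel only where the mask allows it."""
    return tuple(new if enabled else old
                 for old, new, enabled in zip(stored, incoming, mask))

# Update red, green, and blue but leave the stored alpha untouched.
result = masked_write((0.1, 0.2, 0.3, 1.0),
                      (0.9, 0.8, 0.7, 0.5),
                      (True, True, True, False))
```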