Deprecation of vertex shaders and addition of shading layers in upcoming OpenGL LM

Hello,

I have read an interesting article about the new geometry shaders in OpenGL at http://appsrv.cse.cuhk.edu.hk/~ymxie/Geometry_Shader/
and I found two interesting ideas.
It seems that vertex shaders are just a special case of geometry shaders whose input and output primitives are set to points and which emit one primitive per invocation.
So why do we still need them? Shouldn’t they be moved to a compatibility layer?
The answer may be: it is handy to set up the scene first and then generate more geometric detail.
OK, then what about shading layers, where you can specify a stack of shaders that are executed in order as successive passes?

Here is my concept:

                  geometry                            fragment
                  shading                             shading
                   stack                               stack
               +---+---+---+                       +---+---+---+ 
+----------+   |   |   |   |   +---------------+   |   |   |   |         +----+
| geometry |-->| 1 | 2 |...|-->| rasterization |-->| 1 | 2 |...|-->...-->| FB |
+----------+   |   |   |   |   +---------------+   |   |   |   |         +----+
               +---+---+---+                       +---+---+---+
               \_____ _____/                       \_____ _____/
                     V                                   V
       GL_MAX_GEOMETRY_SHADING_LAYERS      GL_MAX_FRAGMENT_SHADING_LAYERS

A driver can handle up to GL_MAX_GEOMETRY_SHADING_LAYERS and GL_MAX_FRAGMENT_SHADING_LAYERS programs per stack. glUseProgramObject() must be extended to support layers, and every program object must be attached to one layer (similar to FBO color attachments).
If a layer has no attached program it will be omitted during execution. If no layer has an attached program, the OpenGL pipeline will use fixed functionality.
Program objects cannot mix geometry and fragment shaders (a restriction), because every layer in a stack operates on the same type of data (geometry primitives or fragments) and must produce it for the next layer; this also makes it natural to implement shader stacks as objects (see below).
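
As a rough sketch (the layer argument below is purely hypothetical; the real glUseProgramObject() takes no such parameter), binding programs to fixed stack layers might look like this:

    /* Hypothetical sketch only: glUseProgramObject() extended with a layer
       index, as proposed above.  Which stack a program lands in would follow
       from the type of shaders it contains. */
    #include <GL/gl.h>

    void glUseProgramObject(GLuint layer, GLuint program);   /* hypothetical signature */

    void setup_layers(GLuint geomPass1, GLuint geomPass2, GLuint fragPass1)
    {
        glUseProgramObject(0, geomPass1);  /* layer 0 of the geometry stack */
        glUseProgramObject(1, geomPass2);  /* layer 1 of the geometry stack */
        glUseProgramObject(0, fragPass1);  /* layer 0 of the fragment stack */
        /* a layer with no attached program is simply skipped at run time */
    }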

Execution of the programs in a stack must be performed in order, from first to last. Each program in a lower layer must finish completely before a higher layer can be computed, so there need to be at least two buffers (input and output) for temporary results between programs. The result of the last executed program in the stack is used by the subsequent stages of the OpenGL pipeline.
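
To make the double-buffering point concrete, here is how a driver could, conceptually, ping-pong two scratch buffers between the layers of one stack (pure illustration; the types and helpers are made up):

    /* Conceptual driver-side sketch: run the layers of a stack in order,
       swapping two scratch buffers between them. */
    typedef struct Buffer Buffer;          /* opaque scratch buffer (illustrative) */
    extern void run_program(unsigned int program, const Buffer *in, Buffer *out);

    void run_stack(const unsigned int *stack, int num_layers, Buffer *a, Buffer *b)
    {
        Buffer *in = a, *out = b;
        for (int i = 0; i < num_layers; ++i) {
            if (stack[i] == 0)
                continue;                        /* empty layer: omitted        */
            run_program(stack[i], in, out);      /* reads 'in', writes 'out'    */
            Buffer *tmp = in; in = out; out = tmp;   /* swap for the next layer */
        }
        /* 'in' now holds the result of the last executed layer, which feeds
           the remaining pipeline stages */
    }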

Stacks of shaders could also be implemented as objects: instead of binding shader programs to fixed stacks with glUseProgramObject(), we could create a stack object, attach programs to it, and use it with glUseGeometryStack(stack_handler) and glUseFragmentStack(stack_handler), just as we use glUseProgramObject() today (stack_handler==0 means fixed functionality).
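
For example, binding such stack objects in place of a single program object might look like this (both entry points are of course hypothetical):

    /* Hypothetical sketch only: binding previously built stack objects.
       A possible creation/attachment API is sketched later in the thread. */
    #include <GL/gl.h>

    void glUseGeometryStack(GLuint stack_handler);   /* hypothetical */
    void glUseFragmentStack(GLuint stack_handler);   /* hypothetical */

    void use_stacks(GLuint geom_stack, GLuint frag_stack)
    {
        glUseGeometryStack(geom_stack);   /* 0 would mean fixed functionality */
        glUseFragmentStack(frag_stack);   /* 0 would mean fixed functionality */
    }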

In addition, it could also be possible to stream data out from a stack directly to VRAM, as in DirectX 10, by setting a stream-out attribute on the particular stack object (that attribute could simply be a handle to a vertex buffer for the geometry shading stack, or to a texture for the fragment shading stack; if it is set to 0, data are propagated to the next stage of the graphics pipeline).
The advantage here is that data are transmitted only after all stack operations, without the per-pass API involvement that DirectX 10 requires between shaders.
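
A stream-out setter could then be as simple as this (the call name is invented purely to illustrate the idea):

    /* Hypothetical: route the final output of a stack into a buffer object
       or texture instead of (or in addition to) the next pipeline stage. */
    #include <GL/gl.h>

    void glStackStreamOut(GLuint stack_handler, GLuint buffer_or_texture);  /* hypothetical */

    void capture_geometry(GLuint geom_stack, GLuint vertex_buffer)
    {
        /* after the whole stack has run, the result lands in vertex_buffer;
           passing 0 would send the data on to rasterization as usual */
        glStackStreamOut(geom_stack, vertex_buffer);
    }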

This scheme has five big advantages:

  • a single rendering pass can do much more of the work of computing the final output

  • it reduces the number of additional passes that need to be performed, and therefore the API overhead (one of the stated OpenGL LM goals)

  • it simplifies the OpenGL interface: there is no longer a need for separate vertex shader functionality (another OpenGL LM goal)

  • since we operate on full geometry shaders from the beginning, geometry setup can work not only on individual points (as in vertex shaders) but also on whole primitives

  • if shader stacks are objects, it opens the possibility of computing different/unrelated operations in parallel in the future, once there is support for multi-path rendering with synchronization (i.e. when there are two or more graphics cards, or GPU parallelism is exposed to software).

Use cases for geometry shaders are pretty obvious, but what about fragment shaders?
There is a need for this there as well. For example, I was writing a complete hardware MPEG decoder based on OpenGL. The main problem, which degraded both performance and rendering quality, was the inability to quickly perform precomputations that feed the next fragment computation. A fast IDCT can be performed in two separate passes, which could be done very quickly by joining them in a shader stack.
Other multipass operations, such as blurring (one blur shader across a few layers) or shadow blending, also come to mind and could be done in a single pass with this approach.
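
A separable blur, for instance, would map onto a fragment stack as two layers, roughly like this (every call here is hypothetical; the two programs would be ordinary fragment programs compiled elsewhere):

    /* Hypothetical sketch: a separable blur as two layers of one fragment
       shading stack, replacing two render-to-texture passes. */
    #include <GL/gl.h>

    GLuint glCreateFragmentStack(void);                                       /* hypothetical */
    void   glAttachProgramToStack(GLuint stack, GLuint layer, GLuint prog);   /* hypothetical */
    void   glUseFragmentStack(GLuint stack_handler);                          /* hypothetical */

    void setup_blur(GLuint horizontalBlurProgram, GLuint verticalBlurProgram)
    {
        GLuint blur = glCreateFragmentStack();
        glAttachProgramToStack(blur, 0, horizontalBlurProgram);  /* pass 1 */
        glAttachProgramToStack(blur, 1, verticalBlurProgram);    /* pass 2 */
        glUseFragmentStack(blur);
        /* a single draw now performs both blur passes back to back */
    }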

Please consider these propositions for the upcoming “Mt. Evans” OpenGL release. I think this could be a huge improvement, especially compared to the current DirectX 10 specification.

Cheers
Wojtek

So why do we still need them? Shouldn’t they be moved to a compatibility layer?
No.

1: Vertex shader hardware exists. And it’s not going anywhere in the near future.

2: The purpose of geometry shaders is different from that of vertex shaders. There’s no guarantee that geometry shaders coopting vertex shader functionality will be possible, let alone as performant as vertex shaders.

Please consider these propositions for the upcoming “Mt. Evans” OpenGL release.
You can forget that. Mt Evans won’t be deprecating any functionality; just adding to Longs Peak. So vertex shaders aren’t going away anytime soon.

OK, maybe it really is too futuristic for now, but let me explain my concept further, so that it could be implemented as yet another OpenGL extension in case new hardware can use it effectively together with the current OpenGL implementation.

It could be done like this:

The shading stack object (SSO) has two containers, a geometry stack and a fragment stack, and three attributes: a geometry stream-out handler, a fragment stream-out handler, and a three-state switch that selects between them and normal pipeline output.

+---------------------------------+
| +---+---+---+    +---+---+---+  |
| |   |   |   |    |   |   |   |  |
| | 1 | 2 |...|    | 1 | 2 |...|  |
| |   |   |   |    |   |   |   |  |
| +---+---+---+    +---+---+---+  |
| geometry stack   fragment stack |
|                                 |
| [geom_handler]   [frag_handler] |
|        [pipeline_switch]        |
+---------------------------------+
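
In C terms the object could be modelled roughly like this (only a mental model, not a proposed data layout; the field names and MAX_* limits are illustrative):

    /* Conceptual model of a shading stack object (SSO). */
    #include <GL/gl.h>

    enum PipelineSwitch {
        OUTPUT_PIPELINE  = 0,    /* pass results to the next pipeline stage */
        OUTPUT_STREAMOUT = 1,    /* write results to the stream-out object  */
        OUTPUT_BOTH      = 2     /* do both                                 */
    };

    #define MAX_GEOMETRY_LAYERS 8   /* stands in for GL_MAX_GEOMETRY_SHADING_LAYERS */
    #define MAX_FRAGMENT_LAYERS 8   /* stands in for GL_MAX_FRAGMENT_SHADING_LAYERS */

    struct ShadingStackObject {
        GLuint geometry_stack[MAX_GEOMETRY_LAYERS];  /* program per layer, 0 = empty */
        GLuint fragment_stack[MAX_FRAGMENT_LAYERS];  /* program per layer, 0 = empty */
        GLuint geom_handler;           /* vertex buffer for geometry stream-out, 0 = none */
        GLuint frag_handler;           /* texture for fragment stream-out, 0 = none       */
        enum PipelineSwitch pipeline_switch;         /* the three-state switch             */
    };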

Normal program objects used with glUseProgramObject() consist of vertex and fragment shaders. If a program object contains only geometry shaders or only fragment shaders, it would be considered usable only with shading stack objects.

An SSO could be used instead of normal shaders by binding it with, say, a glUseShadingStack(stack_handler) call. Whenever either of these two functions (glUseProgramObject() or glUseShadingStack()) is called, it replaces the previous binding.

If an SSO is active and complete (see below), it performs all computations in the geometry stack and, according to the pipeline switch, puts the result into the stream-out geometry object (say, a vertex buffer) and/or into normal pipeline rasterization. Similarly, the fragment stack can send its result to the stream-out fragment object (say, a texture) and/or on to the subsequent pipeline tests and blending for framebuffer output.

The SSO API could look like this (a usage sketch follows the list):

  • glCreateShadingStack()
  • glUseShadingStack(stack_handler)
  • glAttachProgramObject(stack_handler, stack_selector, layer_number, program_object) (program_object==0 detaches the program from the selected layer)
  • glCheckShadingStackStatus() - checks completeness, as with FBOs
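
Put together, using this API might look like the following (every entry point and token here is hypothetical, invented only to show the flow):

    /* Hypothetical sketch of the proposed SSO API in use. */
    #include <GL/gl.h>

    #define GL_GEOMETRY_STACK 0x0001    /* hypothetical stack_selector value */
    #define GL_FRAGMENT_STACK 0x0002    /* hypothetical stack_selector value */
    #define GL_STACK_COMPLETE 0x0003    /* hypothetical status value         */

    GLuint glCreateShadingStack(void);                                        /* hypothetical */
    void   glUseShadingStack(GLuint stack_handler);                           /* hypothetical */
    void   glAttachProgramObject(GLuint stack_handler, GLenum stack_selector,
                                 GLuint layer_number, GLuint program_object); /* hypothetical */
    GLenum glCheckShadingStackStatus(void);     /* checks the bound stack, like FBOs */

    int bind_sso(GLuint geomPass1, GLuint geomPass2, GLuint fragPass1)
    {
        GLuint sso = glCreateShadingStack();

        glAttachProgramObject(sso, GL_GEOMETRY_STACK, 0, geomPass1);
        glAttachProgramObject(sso, GL_GEOMETRY_STACK, 1, geomPass2);
        glAttachProgramObject(sso, GL_FRAGMENT_STACK, 0, fragPass1);
        /* passing program_object == 0 would detach whatever is on that layer */

        glUseShadingStack(sso);        /* replaces any previous program/stack binding */
        if (glCheckShadingStackStatus() != GL_STACK_COMPLETE)
            return 0;                  /* incomplete, e.g. a program did not compile  */
        return 1;
    }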

Completeness and relations between shader attributes:
Every program bound to an SSO must be properly compiled.
Uniform attributes behave just as in normal shaders. Attributes and varyings (which serve as normal attributes between geometry shaders) are inherited from lower layers to upper layers in every pass. There cannot be a name conflict (a varying and an attribute with the same name) or a type conflict between them.
An attribute that is unchanged (or even undeclared) in one layer is copied without modification to the next one. If one layer changes it, the next layer sees its new value. Varying attributes inherited from all layers are interpolated for the fragment stack.
Similarly, varying attributes are inherited between fragment shaders, and the values computed in the last fragment program are used for the framebuffer computations.
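
For example, under those inheritance rules two stacked fragment layers could communicate through a shared varying, roughly like this (this is not legal GLSL today; it simply assumes the semantics proposed above, with the shader sources shown as C strings):

    /* Illustrative only: layer 1 writes a varying, layer 2 inherits it. */
    static const char *layer1_src =
        "varying vec3 base_color;                                         \n"
        "void main() {                                                    \n"
        "    base_color   = vec3(0.2, 0.4, 0.8);  /* written by layer 1 */\n"
        "    gl_FragColor = vec4(base_color, 1.0);                        \n"
        "}                                                                \n";

    static const char *layer2_src =
        "varying vec3 base_color;  /* inherited: already holds layer 1's value */\n"
        "void main() {                                                    \n"
        "    /* layer 2 reads the inherited value and refines the result */\n"
        "    gl_FragColor = vec4(base_color * 0.5, 1.0);                  \n"
        "}                                                                \n";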

Since every input buffer from the previous pass is read-only in the new pass, it would also be possible to fetch fragment values and attributes from other fragments, as from a normal lookup table.

Those are my first thoughts - maybe they will be interesting to someone from the big vendors, maybe not. Time will tell.

P.S. I have been writing very quickly and I’m not a native English speaker, so sorry for any errors.
I have no more time for now.

OK, I read it (multiple times), but I still didn’t fully understand it.
But anyway, here it goes:

  1. I can envision photon tracing hardware before I can envision this. GPUs are fast because there are a number of fixed and very dedicated steps in the rendering pipeline; one could say that it all flows through it smoothly.
    Now, what I think you’re suggesting is that one should be able to connect the “plumbing” in any way you like.
    It’s an OK idea, I guess, but there is a reason why I call a plumber every time I need to change anything: he can do it way better than I can.
    It’s the same with the graphics pipeline: the hardware vendors do know what the best way of connecting the pipes is, and both they and I know that it’s better if there are no joints in the system.

  2. True, vertex shaders and geometry shaders could have been made into one shader, but not all systems have geometry processing functions and not all programs need the geo shaders, so as it is now these shaders perform two related but still distinct functions, and it will remain that way until something drastically changes (like some kind of true ray-tracing card that does not need any vertex and geometry transformation).

  3. think “backwards compatible”.

Originally posted by Wojciech Milkowski:
It seems that vertex shaders are just a special case of geometry shaders whose input and output primitives are set to points and which emit one primitive per invocation.
So why do we still need them? Shouldn’t they be moved to a compatibility layer?

Vertex shaders don’t emit primitives. Vertex shaders process and emit vertices. Geometry shaders process and emit primitives. A better name would have been “primitive shader”, but that’s about as interesting as “pixel shader” vs. “fragment shader”. Strictly speaking a geometry shader can do everything a vertex shader can. However, keeping vertex shading separate makes sense from a performance point of view. Everything that’s vertex specific goes into the vertex shader, and what’s primitive specific goes into the geometry shader. This allows you to reuse vertex shader results for vertices that are shared between many primitives. It’s pretty much the same reason we have glDrawElements() and not just the glDrawArrays() call. If we only had glDrawArrays(), then geometry shaders would be all that’s needed, but as long as you have vertex sharing, a separate vertex shader makes sense.
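
To make the sharing argument concrete: in the indexed quad below, vertices 0 and 2 are used by both triangles, so their vertex shader results can be reused, whereas a pure glDrawArrays() submission would have to shade six vertices instead of four.

    /* Two triangles forming a quad, drawn with glDrawElements() and
       old-style client-side vertex arrays. */
    #include <GL/gl.h>

    static const GLfloat quad_vertices[] = {
        -1.0f, -1.0f,   /* vertex 0 */
         1.0f, -1.0f,   /* vertex 1 */
         1.0f,  1.0f,   /* vertex 2 */
        -1.0f,  1.0f    /* vertex 3 */
    };

    static const GLushort quad_indices[] = {
        0, 1, 2,        /* first triangle  */
        0, 2, 3         /* second triangle (shares vertices 0 and 2) */
    };

    void draw_quad(void)
    {
        glEnableClientState(GL_VERTEX_ARRAY);
        glVertexPointer(2, GL_FLOAT, 0, quad_vertices);
        glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, quad_indices);
        glDisableClientState(GL_VERTEX_ARRAY);
    }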