Future Ideas For GLslang

Hi all!

I just want to post some future ideas for GLslang here. Tell me what you think about them!

-) Uniform Programs
The idea of these programs is that they are executed when a uniform is passed to the GL. For example, the light position in the GL state is multiplied by the current Modelview matrix when it is set. Such operations should be available for all uniforms.
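
A minimal GLSL sketch of the kind of work this could remove (the uniform names here are made up for illustration): the product below depends only on uniforms, so a uniform program could evaluate it once when the uniform is set instead of once per vertex.

// Sketch only: userModelView and userLightPos are illustrative names.
// The product depends only on uniforms, so a "uniform program" could
// compute eyeLightPos once at upload time instead of per vertex.
uniform mat4 userModelView;
uniform vec4 userLightPos;

void main()
{
    vec4 eyeLightPos = userModelView * userLightPos;   // uniform-only work
    vec3 eyeVertex   = vec3(gl_ModelViewMatrix * gl_Vertex);
    vec3 L           = normalize(eyeLightPos.xyz - eyeVertex);
    float diffuse    = max(dot(gl_NormalMatrix * gl_Normal, L), 0.0);
    gl_FrontColor    = vec4(diffuse, diffuse, diffuse, 1.0);
    gl_Position      = ftransform();
}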

-) Matrix Stacks for generic matrices
Just like the Modelview or Texture matrix stack but for generic matrices used in Shaders.

-) Generic Buffers
The ability to write your own values from a Fragment Shader into special floating-point buffers. I don’t know if that is included in the superbuffer extension…

-) Frame Buffer Access
Used for custom stencil, depth, alpha, … testing and other nice operations. I know that this was originally defined in GLslang but I don’t understand why it was deleted.

-) General Pipelined Processors
There are two types of processors now: Vertex Shader and Fragment Shader. If GLslang supports general processors there’s no need to define new special processors like triangulation processors (displacement mapping, …).
My idea is the following: a general processor has specific inputs and outputs (e.g. for a triangulation processor: input: a surface, output: a couple of triangles). The outputs are then sent to the next general processor, and so on. Each processor should work on just one input item and should have the ability to load general data, for example a texture.

greets
Corrail

Just like the Modelview or Texture matrix stack but for generic matrices used in Shaders.

You’re a big boy now; you can do it yourself

I don’t understand why it was deleted.

I think 3 or 4 issues in the glslang spec debate the merits of having or not having this functionality. If you want an explanation as to why it was removed, look there.

If GLslang supports general processors there’s no need to define new special processors like triangulation processors (displacement mapping, …).

What if hardware manufacturers don’t want to go that route? What if having specialized processors is the most efficient way to go for them?

You can’t have a language going off on its own little path. Glslang must follow, to some degree, the path of hardware. It cannot dictate this path, because such a thing could easily dampen hardware evolution if the glslang path is the wrong one. The API should evolve to fit the hardware, and possibly urge the hardware in obvious directions. It should not try to dictate long-term goals to the hardware.

Also, how do you define inputs and outputs? Yes, the output of a primitive processor would be the input of the vertex shader, but the output of the vertex shader is not the same as the input to the fragment shader. For each triangle, there are many calls to fragment shaders. This is determined by the scan conversion and interpolator unit. This unit should not be programmable, since this operation needs to be fast (and there isn’t really too much to do here). So, it isn’t trivial to design what the “inputs” and “outputs” for these processors are.

GLSL looks pretty mature now, actually. It’s certainly more versatile than DirectX’s HLSL or Assembly, and it doesn’t force the shader programmer to select a shader version to work with.

The next step forward will have to be made by the hardware. I’ve heard that ATI is bringing out an R420 chip soon (don’t know what nVidia’s answer to that is) but there doesn’t seem to be any concrete info. Some sources say it’ll have PS3 and VS3 support…
What I’d like to see is hardware implementation of some of the higher-level functions, eg. trig functions. The modern VPU has often got more floating-point grunt than the host machine’s CPU, so perhaps a wider range of math functions should be added?

What kind of trig functions are you looking for exactly? NVIDIA GPUs since NV3x have included single-cycle sin and cos instructions in the vertex and pixel shader.
Most other functions can be built up from the basic instructions, or baked into look-up table textures. It’s like the old CISC vs. RISC debate on CPUs. If anything GPUs are moving in a more RISC direction, i.e. simpler instruction sets but higher performance.
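
As a rough sketch of the look-up-table approach in GLSL (lutTex is an illustrative name; the application would bake precomputed values of whatever function is needed into the 1D texture, mapped over [0, 1]):

// Sketch: replace an expensive math function with a 1D texture lookup.
// Assumes the application has baked the function into lutTex over [0, 1].
uniform sampler1D lutTex;
varying float x;   // function parameter in [0, 1], set up by the vertex shader

void main()
{
    float fx = texture1D(lutTex, x).r;   // one texture fetch instead of ALU work
    gl_FragColor = vec4(fx, fx, fx, 1.0);
}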

Generic buffers - you can already write out values to floating point pbuffers. GLSL will be extended to support MRT (multiple draw buffers).
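
Once the multiple-draw-buffers support mentioned above is exposed, writing several values per fragment might look roughly like this (a sketch; the buffer assignments are illustrative):

// Sketch of MRT output: each gl_FragData element goes to a separate
// draw buffer, assuming a draw-buffers style extension is available.
varying vec3 eyeNormal;
varying vec3 eyePos;

void main()
{
    gl_FragData[0] = vec4(eyePos, 1.0);                 // e.g. a position buffer
    gl_FragData[1] = vec4(normalize(eyeNormal), 0.0);   // e.g. a normal buffer
}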

Uniform Programs - Microsoft’s HLSL includes a virtual machine that can execute shader code on the CPU. You can use this to pre-calculate stuff like the light position example given above (MS calls these “preshaders”). It’s possible GLSL could be extended to do this also.

Originally posted by Descenterace:
[b]GLSL looks pretty mature now, actually. It’s certainly more versatile than DirectX’s HLSL or Assembly, and it doesn’t force the shader programmer to select a shader version to work with.

The next step forward will have to be made by the hardware. I’ve heard that ATI is bringing out an R420 chip soon (don’t know what nVidia’s answer to that is) but there doesn’t seem to be any concrete info. Some sources say it’ll have PS3 and VS3 support…
What I’d like to see is hardware implementation of some of the higher-level functions, eg. trig functions. The modern VPU has often got more floating-point grunt than the host machine’s CPU, so perhaps a wider range of math functions should be added?[/b]

What if hardware manufacturers don’t want to go that route? What if having specialized processors is the most efficient way to go for them?

You can’t have a language going off on its own little path. Glslang must follow, to some degree, the path of hardware. It cannot dictate this path, because such a thing could easily dampen hardware evolution if the glslang path is the wrong one. The API should evolve to fit the hardware, and possibly urge the hardware in obvious directions. It should not try to dictate long-term goals to the hardware.

All right, I didn’t have that in mind…

Also, how do you define inputs and outputs?

My idea was that you could define your inputs on your own, with structs or something like that.

This is determined by the scan conversion and interpolator unit. This unit should not be programmable, since this operation needs to be fast (and there isn’t really too much to do here).

Yes, of course these units have to be fast, but if you are to be able to design the inputs and outputs on your own, there MUST be a programmable unit between two programmable processors.
I also thought of fixed processors which have defined inputs and outputs and can be used instead of programmable units, e.g. an interpolator unit. You can take advantage of a fully programmable unit (which might be slow), or you can use a non-programmable unit which might be faster.

Generic buffers - you can already write out values to floating point pbuffers. GLSL will be extended to support MRT (multiple draw buffers).

Yes, I know that I’m able to use pbuffers, but I’d like to write out more data than fits in a single buffer. I hope MRTs are supported soon (I think ATI R3xx hardware is already capable of that…).

Uniform Programs - Microsoft’s HLSL includes a virtual machine that can execute shader code on the CPU. You can use this to pre-calculate stuff like the light position example given above (MS calls these “preshaders”). It’s possible GLSL could be extended to do this also.

I didn’t know that this feature was available in DX… :slight_smile:
It doesn’t matter whether it is supported in HW or done in SW, but I think it would be a nice feature.

Originally posted by simongreen:
[b]Uniform Programs - Microsoft’s HLSL includes a virtual machine that can execute shader code on the CPU. You can use this to pre-calculate stuff like the light position example given above (MS calls these “preshaders”). It’s possible GLSL could be extended to do this also.

[/b]

I don’t know MS HLSL, but I think this is an important issue: good work sharing between the CPU and GPU. I don’t want to push all the work onto the GPU, but to have an easy-to-handle split between GPU and CPU. And for a data-/shader-driven architecture this should be handled by the shading language and API.

Couldn’t the uniform programs be handled automatically by the shading compiler? Do the compilers already factor calculations on uniforms out onto the CPU, or are they calculated on the GPU for every vertex/fragment?

For example, if you have something like:

uniform vec4 quaternion[2];
uniform float time;

vec4 qslerp(vec4 q1, vec4 q2, float t) {
    // spherical linear interpolation between two quaternions
    float phi = acos(dot(q1, q2));
    return sin(phi*(1.0-t))/sin(phi)*q1 + sin(phi*t)/sin(phi)*q2;
}

void main() {
    […]
    vec4 q = qslerp(quaternion[0], quaternion[1], time);
    […]
}

Is qslerp calculated on the GPU or on the CPU by the current ATI/NVIDIA glslang compilers?
Doing the calculation on the CPU could be made transparent to the user and would need no changes to the glslang language/API. I would say this is/should be a normal optimization of the compiler (maybe with preprocessor flags for choosing the optimization strategy).
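
In this example the whole qslerp call depends only on uniforms, so such an optimization (or a manual workaround today) would amount to something like the following sketch, with the application computing the slerp once per draw call and uploading the result (qPrecomputed is an illustrative name):

// Sketch: the application (or a "preshader") evaluates
// qslerp(quaternion[0], quaternion[1], time) once on the CPU and
// uploads the result; the vertex shader just reads it.
uniform vec4 qPrecomputed;

void main()
{
    vec4 q = qPrecomputed;   // replaces the per-vertex qslerp call
    // ... use q to transform gl_Vertex as before ...
    gl_Position = ftransform();
}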

Another thing would be the complete execution of shader programs, or parts/functions of them, on the CPU at the user’s request. This would be really useful for handling transformation feedback, like collision response and physical simulation, on the CPU. In my opinion, some additions to the glslang API would indeed be needed for that.
