Difference between revisions of "GLSL : common mistakes"

From OpenGL.org
m (NVIDIA and Types)
(glUseProgram)
(21 intermediate revisions by 8 users not shown)
Line 1: Line 1:
The following article discusses common mistakes made in the OpenGL Shading Language, GLSL.
+
The following article discusses common mistakes made in the [[OpenGL Shading Language]], GLSL.
  
 
__TOC__
 
__TOC__
  
== Use the Swizzle ==
+
== Uniforms ==
  
Swizzle masks are essentially free in hardware. Use them where possible.
+
=== How to use glUniform ===
 +
If you look at all the {{apifunc|glUniform|*v}} functions, there is a parameter called count.
  
gl_TexCoord[0].x = gl_MultiTexCoord0.x;
+
What's wrong with this code? Would it cause a crash?
gl_TexCoord[0].y = gl_MultiTexCoord0.y;
+
  
This code can be simplified to:
+
  //Vertex Shader
 +
  uniform vec4 LightPosition;
  
gl_TexCoord[0].xy = gl_MultiTexCoord0.xy;
+
  //In your C++ code
 +
  float light[4];
 +
  //Fill in `light` with data.
 +
  glUniform4fv(MyShader, 4, light);
  
Drivers may detect and optimize this kind of thing, but they may not. It is best to do it for them when you can.
+
The problem is that for count, you set it to 4 while it should be 1 because you are sending '''1''' {{code|vec4}} to the shader. The count is the number of that type (4f, which corresponds to {{code|vec4}}) that you are setting.
  
== Get MAD ==
+
Consider this:
  
MAD is short for multiply, then add. It is generally assumed that MAD operations are "single cycle", or at least faster than the alternative.
+
  //Vertex Shader
 +
  uniform vec2 Exponents[5];
  
vec4 result1 = (value / 2.0) + 1.0;
+
  //In your C++ code
vec4 result2 = (value / 2.0) - 1.0;
+
  float Exponents[10];
vec4 result3 = (value / -2.0) + 1.0;
+
  glUniform2fv(MyShader, 5, Exponents);
  
A stupid compiler may directly use these as written: a divide, then add. That might cost 2 or more cycles. Below is GLSL code that converts to a single MAD instruction (for each line of code of course).
+
This is correct. The length of the array is 5, which is what we tell {{apifunc|glUniform|2fv}}.
  
vec4 result1 = (value * 0.5) + 1.0;
+
=== glUniform doesn't work ===
vec4 result2 = (value * 0.5) - 1.0;
+
vec4 result3 = (value * -0.5) + 1.0;
+
  
These is much more likely to convert into MAD operations.
+
You probably did not bind the correct shader first. Call {{apifunc|glUseProgram|(myprogram)}} first.
  
One expression might be better than the other.
+
=== glGetUniformLocation and glGetActiveUniform ===
 +
Although not strictly a mistake, some wonder why {{apifunc|glGetUniformLocation}} returns -1. If there is a uniform that you are not using, the driver will optimize your uniform out. Drivers are really good at optimizing code. If you are using your uniform, but none of the values computed from that uniform contribute to any output from the shader (directly or indirectly), the uniform will usually be optimized out.
  
result = 0.5 * (1.0 + variable);
+
Typically this is not a problem, since if you pass -1 instead of a valid uniform location to the {{apifunc|glUniform}} calls, they will quietly do nothing anyway. But you will also get -1 if you misspell the variable name to {{apifunc|glGetUniformLocation}}, so keep this in mind.
  
This may be converted into an add followed by a multiply. It can be expressed in a way that more explicitly allows for a MAD operation:
+
=== glUseProgram ===
 +
When should you call {{apifunc|glUseProgram}}?
  
result = 0.5 + 0.5 * variable;
+
{{apifunc|glUseProgram}} needs to be called before you setup a uniform (unless you have GL 4.1 or {{extref|separate_shader_objects}}, and can use {{apifunc|glProgramUniform}}). There are several versions of the {{apifunc|glUniform}} function depending if your variable is a single float, vec2, vec3, vec4, a matrix, etc. Notice that the {{apifunc|glUniform}} functions do not take the program ID (your shader) as a parameter.
  
The compiler may be able to optimize this automatically, but it may not. Best to be careful.
+
Getting the location of a uniform, such as from {{apifunc|glGetUniformLocation}}, does not require calling {{apifunc|glUseProgram}}. {{apifunc|glGetUniformLocation}} already takes the program to get the location from.
  
=== Assignment with MAD ===
+
{{apifunc|glUseProgram}} is needed for rendering. You must use the program you intend to render with before issuing a [[Vertex Rendering|rendering call]].
Assume that you want to set the output value ALPHA to 1.0. Here is one method :
+
  
  myOutputColor.xyz = myColor.xyz;
+
=== Uniform Names across shader stages ===
  myOutputColor.w = 1.0;
+
  gl_FragColor = myOutputColor;
+
  
The above code can be 2 or 3 move instructions, depending on the compiler and the GPU's capabilities. Newer GPUs can handle setting different parts of <code>gl_FragColor</code>, but older ones can't, which means they need to use a temporary to build the final color and set it with a 3rd move instruction.
+
It is legal to have the same uniform defined in different shader stage.
  
You can use a MAD instruction to set all the fields at once:
+
When you call {{apifunc|glGetUniformLocation}}, it will return one location. When you update the uniform with a call to {{apifunc|glUniform}}, the driver takes care of sending the value for each stage (vertex shader, geometry shader, fragment shader).
  
  const vec2 constantList = vec2(1.0, 0.0);
+
This is because a GLSL program contains all of the shader stages at once. Programs do not consider uniforms in a vertex shader to be different from uniforms in a fragment shader.
  gl_FragColor = mycolor.xyzw * constantList.xxxy + constantList.yyyx;
+
 
+
This does it all with one MAD operation, assuming that the building of the constant is compiled directly into the executable.  
+
 
+
== Fast Built-ins ==
+
 
+
There are a number of built-in functions that are quite fast, if not "single-cycle" (to the extent that this means something for various different hardware).
+
 
+
=== Linear Interpolation ===
+
 
+
Let's say we want to linearly interpolate between two values, based on some factor:
+
 
+
vec3 colorRGB_0, colorRGB_1;
+
float alpha;
+
resultRGB = colorRGB_0 * (1.0 - alpha) + colorRGB_1 * alpha;
+
 
+
This can be converted to the following for MAD purposes.
+
 
+
resultRGB = colorRGB_0  + alpha * (colorRGB_1 - colorRGB_0);
+
 
+
GLSL provides the <code>mix</code> function. This function should be used where possible:
+
 
+
resultRGB = mix(colorRGB_0, colorRGB_1, alpha);
+
 
+
=== Dot products ===
+
 
+
It is reasonable to assume that dot product operations, despite the complexity of them, will be fast operations (possibly single-cycle). Given that knowledge, the following code can be optimized:
+
 
+
  vec3 fvalue1;
+
  result1 = fvalue1.x + fvalue1.y + fvalue1.z;
+
  vec4 fvalue2;
+
  result2 = fvalue2.x + fvalue2.y + fvalue2.z + fvalue2.w;
+
 
+
This is essentially a lot of additions. Using a simple constant and the dot-product operator, we can have this:
+
 
+
  const vec4 AllOnes = vec4(1.0);
+
  vec3 fvalue1;
+
  result1 = dot(fvalue1, AllOnes.xyz);
+
  vec4 fvalue2;
+
  result2 = dot(fvalue2, AllOnes);
+
 
+
This performs the computation all at once.
+
  
 
== Miscellaneous ==
 
== Miscellaneous ==
Line 105: Line 65:
  
 
=== Binding A Texture ===
 
=== Binding A Texture ===
When you compile and link your GLSL shader, the next step is to get uniform locations for your samplers (I'm talking about texture samplers) and setup the samplers. Some people do this:
+
{{main|GLSL_Sampler#Binding_textures_to_samplers}}
 
+
glUniform1i(location, textureID)
+
 
+
You can't send a GL texture ID as your sampler. A sampler should be from 0 to the max number of texture image units. Once you compile and link your shader, make sure that you setup all the samplers by calling (assuming of course your samplers are named Texture0, Texture1 and Texture2)
+
 
+
location=glGetUniformLocation(shaderProgram, "Texture0");
+
glUniform1i(location, 0);
+
location=glGetUniformLocation(shaderProgram, "Texture1");
+
glUniform1i(location, 1);
+
location=glGetUniformLocation(shaderProgram, "Texture2");
+
glUniform1i(location, 2);
+
 
+
To bind a texture, always use <code>glBindTexture</code>.
+
 
+
glActiveTexture(GL_TEXTURE0);
+
glBindTexture(GL_TEXTURE_2D, textureID[0]);
+
glActiveTexture(GL_TEXTURE1);
+
glBindTexture(GL_TEXTURE_2D, textureID[1]);
+
glActiveTexture(GL_TEXTURE2);
+
glBindTexture(GL_TEXTURE_2D, textureID[2]);
+
 
+
Alternatively:
+
 
+
for(i=0; i<3; i++)
+
{
+
  glActiveTexture(GL_TEXTURE0+i);
+
  glBindTexture(GL_TEXTURE_2D, textureID[i]);
+
}
+
 
+
If you don't set the samplers properly, you might get a link failure that says something to the effect of:
+
 
+
Output from shader Fragment shader(s) linked, vertex shader(s) linked.
+
Validation failed - samplers of different types are bound to the same texture image unit.
+
  
 
=== NVIDIA and Types ===
 
=== NVIDIA and Types ===
Line 146: Line 73:
 
  float myvalue = 0;
 
  float myvalue = 0;
  
This is not legal according to the GLSL specification 1.10, due to the inability to automatically convert from integers (numbers without decimals) to floats (numbers with decimals). Use 0.0 instead. With GLSL 1.20 and above, it is legal because it will be converted to a float.
+
The above is not legal according to the GLSL specification 1.10, due to the inability to automatically convert from integers (numbers without decimals) to floats (numbers with decimals). Use 0.0 instead. With GLSL 1.20 and above, it is legal because it will be converted to a float.
 +
 
 +
float myvalue1 = 0.5f;
 +
float myvalue2 = 0.5F;
 +
 
 +
The above is not legal according to the GLSL specification 1.10. With GLSL 1.20, it becomes legal.
  
 
  float texel = texture2D(tex, texcoord);
 
  float texel = texture2D(tex, texcoord);
Line 157: Line 89:
 
=== Functions inputs and outputs ===
 
=== Functions inputs and outputs ===
  
Functions should look like this
+
Functions parameters must be declared with the {{code|in}}, {{code|out}}, or {{code|inout}} qualifiers.
vec4 myfunction(inout float value1, in vec3 value2, in vec4 value3)
+
 
instead of
+
<source lang="glsl">
vec4 myfunction(float value1, vec3 value2, vec4 value3)
+
//Correct
 +
vec4 myfunction(inout float value1, in vec3 value2, in vec4 value3);
 +
 
 +
//Wrong
 +
vec4 myfunction(float value1, vec3 value2, vec4 value3);
 +
</source>
  
 
=== Not Used ===
 
=== Not Used ===
Line 173: Line 110:
 
Also, do not use gl_TexCoord[0]. Define your own varying and call it VaryingTexCoord0.
 
Also, do not use gl_TexCoord[0]. Define your own varying and call it VaryingTexCoord0.
  
=== How to use glUniform ===
 
If you look at all the glUniform functions (glUniform1fv, glUniform2fv, glUniform3fv, glUniform4fv, glUniform1iv, glUniform2iv, glUniform3iv, glUniform4iv, glUniformMatrix4fv and the many others), there is a parameter called count.<br>
 
<br>
 
What's wrong with this code? Would it cause a crash?<br>
 
  //Vertex Shader
 
  uniform vec4 LightPosition;
 
  //In your C++ code
 
  float light[4];
 
  glUniform4fv(MyShader, 4, light);
 
  
The problem is that for count, you set it to 4 while it should be 1 because you are sending 1 vec4 to the shader.<br>
+
=== Sampling and Rendering to the Same Texture ===
What's wrong with this code? Would it cause a crash?<br>
+
{{main|Framebuffer_Object#Feedback_Loops}}
  //Vertex Shader
+
  uniform vec2 Exponents;
+
  //In your C++ code
+
  float Exponents[2];
+
  glUniform2fv(MyShader, 2, Exponents);
+
  
The problem is that for count, you set it to 2 while it should be 1 because you are sending 1 vec2 to the shader.<br>
 
What's wrong with this code? Would it cause a crash?<br>
 
  //Vertex Shader
 
  uniform vec2 Exponents[5];
 
  //In your C++ code
 
  float Exponents[10];
 
  glUniform2fv(MyShader, 5, Exponents);
 
 
There is nothing wrong with it. We want to send 5 values of vec2.
 
 
=== glUniform doesn't work ===
 
 
You probably did not bind the correct shader first. Call glUseProgram(myprogram) first.
 
 
=== glUniform causes a slow down ===
 
 
All the glUniform calls are relatively fast except that it has been reported that on some nVidia drivers, when certain values are sent to the shader, the driver recompiles and reoptimizes your shader. This is obviously a problem for games. Values are 0.0, 0.5, 1.0. There is no solution other than to avoid those exact numbers. Has nVidia solved this issue in recent drivers? Unknown.
 
 
=== glGetUniformLocation ===
 
Although not strictly a mistake, some wonder why glGetUniformLocation returns -1. If there is a uniform that you are not using, the driver will optimize your uniform out. Drivers are really good at optimizing code. If you are using your uniform and it is clear that the uniform will never effect the output, the uniform will get optimized out.
 
 
=== Uniform Names in VS, GS and FS ===
 
 
So what happens if you have the same exact uniform name in both the vertex shader and geometry shader and fragment shader?
 
 
Yes, it is legal to have the same uniform name in all shaders.
 
 
When you call glGetUniformLocation, it will return one location. When you update the uniform with a call to glUniform, the driver takes care of sending the value for each stage (vertex shader, geometry shader, fragment shader).
 
 
This is because a GLSL shader is considered monolithic : the VS, GS and FS is considered as 1 shader.
 
 
Keep in mind that this applies to all uniforms : float, vec2, vec3, vec4, mat3, mat4, bool, sampler2D, sampler3D and the many others.
 
 
 
=== Sampling and Rendering to the Same Texture ===
 
 
Normally, you should not sample a texture and render to that same texture at the same time. This would give you undefined behavior. It might work on some GPUs and with some driver version but not others.
 
Normally, you should not sample a texture and render to that same texture at the same time. This would give you undefined behavior. It might work on some GPUs and with some driver version but not others.
  
The extension [[http://www.opengl.org/registry/specs/NV/texture_barrier.txt GL_NV_texture_barrier]] can be used to avoid this in certain ways. Specifically, you can use the barrier to ping-pong between two regions of the same texture without having to switch textures or buffers or anything. You still don't get to read and write to the same location in a texture at the same time, though.
+
The extension {{extref|texture_barrier|NV}} can be used to avoid this in certain ways. Specifically, you can use the barrier to ping-pong between two regions of the same texture without having to switch textures or buffers or anything. You still don't get to read and write to the same location in a texture at the same time unless there is only a single read and write of each texel, and the read is in the fragment shader invocation that writes the same texel.
  
 +
[[Category:OpenGL Shading Language]]
 
[[Category:Best Practices]]
 
[[Category:Best Practices]]

Revision as of 00:04, 9 October 2012

The following article discusses common mistakes made in the OpenGL Shading Language, GLSL.

Uniforms

How to use glUniform

Every glUniform*v function takes a parameter called count.

What's wrong with this code? Would it cause a crash?

 //Vertex Shader
 uniform vec4 LightPosition;
 //In your C++ code
 float light[4];
 //Fill in `light` with data.
 glUniform4fv(MyShader, 4, light);

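The problem is the count argument: it is set to 4, but it should be 1 because you are sending 1 vec4 to the shader. The count is the number of elements of that type (4f, which corresponds to vec4) that you are setting, not the number of floats. The call does not crash. Because LightPosition is not an array, a count greater than 1 makes the call fail with GL_INVALID_OPERATION and the uniform keeps its old value. Note also that the first argument of glUniform4fv is the uniform's location, as returned by glGetUniformLocation, not the program object.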

Consider this:

 //Vertex Shader
 uniform vec2 Exponents[5];
 //In your C++ code
 float Exponents[10];
 glUniform2fv(MyShader, 5, Exponents);

This is correct. The uniform is an array of 5 vec2 elements, so the count we pass to glUniform2fv is 5. The 10 floats in the C++ array supply the two components of each element.
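
The count rule above can be sketched as a small model in plain C++. This is not OpenGL code: `UniformInfo`, `UniformCallResult` and `checkUniformFv` are hypothetical names made up for illustration, and the model only covers the checks discussed here (matching component count, and count greater than 1 on a non-array uniform). It assumes count is not negative.

```cpp
#include <algorithm>

// Hypothetical description of one active uniform (not an OpenGL type).
struct UniformInfo {
    int components;  // floats per element: 4 for vec4, 2 for vec2
    int arraySize;   // number of elements; 1 for a non-array uniform
    bool isArray;    // true only if declared with [] in the shader
};

// Outcome of the modelled call.
struct UniformCallResult {
    bool ok;           // false stands for GL_INVALID_OPERATION
    int floatsStored;  // floats actually written to the uniform
};

// Models the checks applied to glUniform{N}fv(location, count, data),
// where `components` is N. Assumes count >= 0.
UniformCallResult checkUniformFv(const UniformInfo& uniform, int components, int count) {
    // Using the wrong variant (e.g. 4fv on a vec2) is an error.
    if (components != uniform.components) {
        return {false, 0};
    }
    // A non-array uniform accepts at most one element.
    if (!uniform.isArray && count > 1) {
        return {false, 0};
    }
    // For arrays, elements past the end of the array are ignored.
    const int elements = std::min(count, uniform.arraySize);
    return {true, elements * components};
}
```

With this model, the LightPosition call with a count of 4 is rejected, the same call with a count of 1 stores 4 floats, and the Exponents call with a count of 5 stores 10 floats.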

glUniform doesn't work

You probably did not bind the correct program first. The glUniform functions always modify the program that is currently in use, so call glUseProgram(myprogram) before calling glUniform.

glGetUniformLocation and glGetActiveUniform

Although not strictly a mistake, some wonder why glGetUniformLocation returns -1. It returns -1 when the name does not correspond to an active uniform in the program. If you declare a uniform but never use it, the driver will optimize it out. Drivers are really good at optimizing code. If you are using your uniform, but none of the values computed from that uniform contribute to any output from the shader (directly or indirectly), the uniform will usually be optimized out as well.

Typically this is not a problem, since if you pass -1 instead of a valid uniform location to the glUniform calls, they will quietly do nothing anyway. But you will also get -1 if you misspell the variable name passed to glGetUniformLocation, so keep this in mind.
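
Because a misspelled name and an optimized-out uniform both yield -1 and both fail silently, it can help to record which names were not found. Below is a minimal sketch of one way to do that. `UniformLocationCache` is a hypothetical helper, not part of OpenGL, and the lookup is passed in as a callable so the class does not depend on a GL context.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: caches uniform locations and remembers which
// names came back as -1 so they can be reported once.
class UniformLocationCache {
public:
    // The callable stands in for a glGetUniformLocation lookup.
    using Lookup = std::function<int(const std::string&)>;

    explicit UniformLocationCache(Lookup lookup) : lookup_(std::move(lookup)) {}

    // Returns the cached location, querying the lookup on first use.
    int get(const std::string& name) {
        const auto it = cache_.find(name);
        if (it != cache_.end()) {
            return it->second;
        }
        const int location = lookup_(name);
        if (location == -1) {
            // Either optimized out or misspelled; the caller decides.
            missing_.push_back(name);
        }
        cache_.emplace(name, location);
        return location;
    }

    // Names for which the lookup returned -1, in first-seen order.
    const std::vector<std::string>& missing() const { return missing_; }

private:
    Lookup lookup_;
    std::unordered_map<std::string, int> cache_;
    std::vector<std::string> missing_;
};
```

In a real application the callable would typically be a lambda that captures the program object and forwards to glGetUniformLocation(program, name.c_str()). Printing missing() after setup makes a typo such as "LightPositon" easy to spot.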

glUseProgram

When should you call glUseProgram?

glUseProgram needs to be called before you set up a uniform (unless you have GL 4.1 or ARB_separate_shader_objects, and can use glProgramUniform). There are several versions of the glUniform function, depending on whether your variable is a single float, vec2, vec3, vec4, a matrix, etc. Notice that the glUniform functions do not take the program ID (your shader) as a parameter, so they always act on the program that is currently in use.

Getting the location of a uniform, such as from glGetUniformLocation, does not require calling glUseProgram. glGetUniformLocation already takes the program to get the location from.

glUseProgram is needed for rendering. You must use the program you intend to render with before issuing a rendering call.
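
The following toy model illustrates why the order matters. It is plain C++, not OpenGL: `ToyContext` and its member names are hypothetical stand-ins for the context state, glUseProgram, glUniform1f and glProgramUniform1f, and the model tracks only which program each value ends up in.

```cpp
#include <map>
#include <utility>

// Toy stand-in for the GL context state; none of these names are OpenGL API.
struct ToyContext {
    unsigned currentProgram = 0;  // 0 means no program is in use

    // (program, location) -> value last stored there
    std::map<std::pair<unsigned, int>, float> values;

    // Plays the role of glUseProgram.
    void useProgram(unsigned program) { currentProgram = program; }

    // Plays the role of glUniform1f: no program parameter, so the value
    // goes to whichever program is current. Returns false where the real
    // call would raise GL_INVALID_OPERATION.
    bool uniform1f(int location, float value) {
        if (currentProgram == 0) {
            return false;  // no program in use
        }
        if (location == -1) {
            return true;   // location -1 is silently ignored
        }
        values[std::make_pair(currentProgram, location)] = value;
        return true;
    }

    // Plays the role of glProgramUniform1f: the program is explicit,
    // so the current program does not matter and is not changed.
    void programUniform1f(unsigned program, int location, float value) {
        if (location != -1) {
            values[std::make_pair(program, location)] = value;
        }
    }
};
```

If program A is in use and you call uniform1f intending to update program B, the value lands in A. That is the usual cause of the "glUniform doesn't work" symptom described earlier.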

Uniform Names across shader stages

It is legal to have the same uniform defined in different shader stages, as long as the declarations match. If the types differ, the program will fail to link.

When you call glGetUniformLocation, it will return one location. When you update the uniform with a call to glUniform, the driver takes care of sending the value for each stage (vertex shader, geometry shader, fragment shader).

This is because a GLSL program contains all of the shader stages at once. Programs do not consider uniforms in a vertex shader to be different from uniforms in a fragment shader.

Miscellaneous

Enable Or Not To Enable

With the fixed-function pipeline, you needed to call glEnable(GL_TEXTURE_2D) to enable 2D texturing. You needed to call glEnable(GL_LIGHTING). Since shaders override these functionalities, you don't need to glEnable/glDisable them. If you don't want texturing, you either need to write another shader that doesn't do texturing, or you can attach an all-white or all-black texture, depending on your needs. You can also write one shader that does lighting and one that doesn't.

For things that are not overridden by shaders, such as the alpha test, depth test and stencil test, calling glEnable/glDisable will still have an effect.

Binding A Texture

Main article: GLSL Sampler, section "Binding textures to samplers"

NVIDIA and Types

NVIDIA drivers are more relaxed than the GLSL specification requires, so code that compiles on them may fail elsewhere. For example:

float myvalue = 0;

The above is not legal according to the GLSL specification 1.10, due to the inability to automatically convert from integers (numbers without decimals) to floats (numbers with decimals). Use 0.0 instead. With GLSL 1.20 and above, it is legal because it will be converted to a float.

float myvalue1 = 0.5f;
float myvalue2 = 0.5F;

The f and F suffixes used above are not legal according to the GLSL specification 1.10. With GLSL 1.20, they become legal.

float texel = texture2D(tex, texcoord);

The above is wrong since texture2D returns a vec4. Do one of these instead:

float texel = texture2D(tex, texcoord).r;
float texel = texture2D(tex, texcoord).x;

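Function inputs and outputs

Function parameters can be declared with the in, out, or inout qualifiers. A parameter with no qualifier is treated as in, so the function works on a copy and any changes made to it are not visible to the caller. If the function is meant to modify an argument for the caller, the parameter must be declared out or inout explicitly.

//value1 is modified for the caller
vec4 myfunction(inout float value1, in vec3 value2, in vec4 value3);
 
//Legal, but every parameter defaults to in, so changes to value1 are lost
vec4 myfunction(float value1, vec3 value2, vec4 value3);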

Not Used

In the vertex shader

gl_TexCoord[0] = gl_MultiTexCoord0;

and in the fragment shader

vec4 texel = texture2D(tex, gl_TexCoord[0].xy);

The zw components aren't being used in the fragment shader.
Keep in mind that as of GLSL 1.30 the built-in attributes and varyings are deprecated, so you should define your own vertex attribute.
This means that instead of gl_MultiTexCoord0, define your own attribute, for example AttrMultiTexCoord0.
Also, do not use gl_TexCoord[0]. Define your own varying, for example VaryingTexCoord0.


Sampling and Rendering to the Same Texture

Normally, you should not sample a texture and render to that same texture at the same time. This would give you undefined behavior. It might work on some GPUs and with some driver versions but not others.

The extension NV_texture_barrier can be used to avoid this in certain ways. Specifically, you can use the barrier to ping-pong between two regions of the same texture without having to switch textures or buffers or anything. You still don't get to read and write to the same location in a texture at the same time unless there is only a single read and write of each texel, and the read is in the fragment shader invocation that writes the same texel.