Difference between revisions of "GLSL : common mistakes"

From OpenGL.org

Revision as of 00:04, 9 October 2012

The following article discusses common mistakes made in the OpenGL Shading Language, GLSL.

Uniforms

How to use glUniform

All of the glUniform*v functions take a parameter called count.

What's wrong with this code? Would it cause a crash?

 //Vertex Shader
 uniform vec4 LightPosition;
 //In your C++ code (MyShader holds the uniform location of LightPosition)
 float light[4];
 //Fill in `light` with data.
 glUniform4fv(MyShader, 4, light);

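The problem is that count is set to 4 when it should be 1, because you are sending 1 vec4 to the shader. The count is the number of elements of that type (4f, which corresponds to vec4) that you are setting, not the number of floats. For a uniform that is not an array, a count greater than 1 makes the call fail with a GL_INVALID_OPERATION error instead of setting the value.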

Consider this:

 //Vertex Shader
 uniform vec2 Exponents[5];
 //In your C++ code (MyShader holds the uniform location of Exponents)
 float Exponents[10];
 glUniform2fv(MyShader, 5, Exponents);

This is correct. The uniform is an array of 5 vec2 elements, so the count we pass to glUniform2fv is 5, even though the C++ array holds 10 floats.
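As a rough illustration of the rule above, the count can be derived from the size of the C++ array and the number of components in the GLSL type. The helper name uniformVectorCount is hypothetical and not part of OpenGL, and the glUniform call appears only in a comment, so treat this as a sketch rather than a recommended API.

```cpp
#include <cstddef>

// Hypothetical helper (not part of OpenGL): returns the `count` to pass to a
// glUniform{1,2,3,4}fv call, given how many floats the C++ array holds and
// how many components one element of the GLSL type has (vec2 -> 2, vec4 -> 4).
// Returns 0 if the float count is not a whole number of elements.
constexpr std::size_t uniformVectorCount(std::size_t floatCount,
                                         std::size_t componentsPerElement)
{
    return (componentsPerElement == 0 || floatCount % componentsPerElement != 0)
               ? 0
               : floatCount / componentsPerElement;
}

// uniform vec4 LightPosition;  float light[4];      -> count is 1
static_assert(uniformVectorCount(4, 4) == 1, "one vec4");
// uniform vec2 Exponents[5];   float Exponents[10]; -> count is 5
static_assert(uniformVectorCount(10, 2) == 5, "five vec2");

// Assumed usage, with `location` obtained from glGetUniformLocation:
//   glUniform2fv(location, (GLsizei)uniformVectorCount(10, 2), Exponents);
```

A result of 0 signals that the array size does not match the type, which is usually a sign that the wrong glUniform variant was chosen.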

glUniform doesn't work

If a glUniform call appears to have no effect, you probably did not bind the correct program first. Call glUseProgram(myprogram) before calling glUniform.

glGetUniformLocation and glGetActiveUniform

Although not strictly a mistake, some wonder why glGetUniformLocation returns -1. If there is a uniform that you are not using, the driver will optimize your uniform out. Drivers are really good at optimizing code. If you are using your uniform, but none of the values computed from that uniform contribute to any output from the shader (directly or indirectly), the uniform will usually be optimized out.

Typically this is not a problem, since if you pass -1 instead of a valid uniform location to the glUniform calls, they will quietly do nothing anyway. But you will also get -1 if you misspell the variable name passed to glGetUniformLocation, so keep this in mind.
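Because a -1 can mean either "optimized out" or "misspelled", it may help during development to record which names failed to resolve. The following is a minimal sketch under stated assumptions: UniformLocationCache is a hypothetical class, and the lookup function is injected so that it can wrap glGetUniformLocation in a real program while remaining usable without a GL context.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper, not part of OpenGL. `lookup` stands in for a call such
// as glGetUniformLocation(program, name.c_str()); it is injected so the
// bookkeeping can be exercised without a GL context.
class UniformLocationCache {
public:
    explicit UniformLocationCache(std::function<int(const std::string&)> lookup)
        : lookup_(std::move(lookup)) {}

    // Returns the cached location, querying the lookup once per name.
    // -1 means the uniform is inactive (optimized out) or the name is misspelled.
    int location(const std::string& name) {
        auto it = cache_.find(name);
        if (it != cache_.end()) return it->second;
        int loc = lookup_(name);
        cache_.emplace(name, loc);
        if (loc == -1) unresolved_.push_back(name);
        return loc;
    }

    // Names that resolved to -1, in first-query order, for review by the developer.
    const std::vector<std::string>& unresolved() const { return unresolved_; }

private:
    std::function<int(const std::string&)> lookup_;
    std::unordered_map<std::string, int> cache_;
    std::vector<std::string> unresolved_;
};
```

Printing the unresolved list once after shader setup makes a typo stand out, while a uniform that was legitimately optimized out can simply be ignored.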

glUseProgram

When should you call glUseProgram?

glUseProgram needs to be called before you set up a uniform (unless you have GL 4.1 or ARB_separate_shader_objects, and can use glProgramUniform). There are several versions of the glUniform function, depending on whether your variable is a single float, vec2, vec3, vec4, a matrix, etc. Notice that the glUniform functions do not take the program ID (your shader) as a parameter.

Getting the location of a uniform, such as from glGetUniformLocation, does not require calling glUseProgram. glGetUniformLocation already takes the program to get the location from.

glUseProgram is needed for rendering. You must use the program you intend to render with before issuing a rendering call.

Uniform Names across shader stages

It is legal to have the same uniform defined in different shader stages.

When you call glGetUniformLocation, it will return one location. When you update the uniform with a call to glUniform, the driver takes care of sending the value for each stage (vertex shader, geometry shader, fragment shader).

This is because a GLSL program contains all of the shader stages at once. Programs do not consider uniforms in a vertex shader to be different from uniforms in a fragment shader.

Miscellaneous

Enable Or Not To Enable

With the fixed-function pipeline, you needed to call glEnable(GL_TEXTURE_2D) to enable 2D texturing, and glEnable(GL_LIGHTING) to enable lighting. Since shaders override this functionality, you don't need to call glEnable/glDisable for it. If you don't want texturing, you either need to write another shader that doesn't do texturing, or you can attach an all-white or all-black texture, depending on your needs. You can also write one shader that does lighting and one that doesn't.

For functionality that shaders do not override, such as the alpha test, depth test, and stencil test, calling glEnable/glDisable still has an effect.

Binding A Texture

Main article: GLSL Sampler, section "Binding textures to samplers".

NVIDIA and Types

nVidia drivers are more relaxed. For example:

float myvalue = 0;

The above is not legal according to the GLSL specification 1.10, due to the inability to automatically convert from integers (numbers without decimals) to floats (numbers with decimals). Use 0.0 instead. With GLSL 1.20 and above, it is legal because it will be converted to a float.

float myvalue1 = 0.5f;
float myvalue2 = 0.5F;

The above is not legal according to the GLSL specification 1.10. With GLSL 1.20, it becomes legal.

float texel = texture2D(tex, texcoord);

The above is wrong since texture2D returns a vec4. Do one of these instead:

float texel = texture2D(tex, texcoord).r;
float texel = texture2D(tex, texcoord).x;

Functions inputs and outputs

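Function parameters can be declared with the in, out, or inout qualifiers. A parameter with no qualifier is treated as in, so both declarations below are legal, but they do not mean the same thing: in the second, changes to value1 inside the function are not written back to the caller. State the qualifier explicitly whenever a parameter is meant to be an output.

//Explicit qualifiers: value1 is read and written back to the caller
vec4 myfunction(inout float value1, in vec3 value2, in vec4 value3);
 
//No qualifiers: every parameter defaults to in, so value1 is input only
vec4 myfunction(float value1, vec3 value2, vec4 value3);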

Not Used

In the vertex shader

gl_TexCoord[0] = gl_MultiTexCoord0;

and in the fragment shader

vec4 texel = texture2D(tex, gl_TexCoord[0].xy);

The zw components of gl_TexCoord[0] are never used in the fragment shader, so only xy needs to be passed along.

Keep in mind that for GLSL 1.30, you should define your own vertex attribute. This means that instead of gl_MultiTexCoord0, define AttrMultiTexCoord0. Also, do not use gl_TexCoord[0]. Define your own varying and call it VaryingTexCoord0.
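One way this advice might look in practice is sketched below, with the GLSL 1.30 sources held as C++ strings ready to hand to glShaderSource. AttrMultiTexCoord0 and VaryingTexCoord0 are the names suggested above; AttrPosition, ModelViewProjection and FragColor are hypothetical names introduced only for this example.

```cpp
#include <string>

// GLSL 1.30 vertex shader. AttrPosition and ModelViewProjection are
// hypothetical names; the application would have to supply both.
const std::string kVertexShaderSource = R"(#version 130
in vec4 AttrPosition;
in vec4 AttrMultiTexCoord0;
uniform mat4 ModelViewProjection;
out vec2 VaryingTexCoord0;   // only xy is passed on; zw is never needed
void main()
{
    VaryingTexCoord0 = AttrMultiTexCoord0.xy;
    gl_Position = ModelViewProjection * AttrPosition;
}
)";

// GLSL 1.30 fragment shader. FragColor is a hypothetical output name.
const std::string kFragmentShaderSource = R"(#version 130
uniform sampler2D tex;
in vec2 VaryingTexCoord0;
out vec4 FragColor;
void main()
{
    FragColor = texture(tex, VaryingTexCoord0);
}
)";
```

Declaring the varying as a vec2 means the unused zw components are never interpolated or passed between stages at all.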


Sampling and Rendering to the Same Texture

Normally, you should not sample a texture and render to that same texture at the same time. This gives you undefined behavior. It might work on some GPUs and with some driver versions but not others.

The extension NV_texture_barrier can be used to avoid this in certain ways. Specifically, you can use the barrier to ping-pong between two regions of the same texture without having to switch textures or buffers or anything. You still don't get to read and write to the same location in a texture at the same time unless there is only a single read and write of each texel, and the read is in the fragment shader invocation that writes the same texel.