Is it still bad to do all lighting in a single pass?

We have had several discussions about this already, and the conclusion has always been that it’s still better to render one pass per light and blend those together (if the number of light sources is not fixed).
So far I didn’t have another choice, as I was using stencil shadow volumes, but now that I have switched to shadow maps, the idea of doing it in one pass is tempting. At the moment I have 2*n+1 passes, n being the number of lights: 1 ambient/depth pass, n passes to update the shadow maps (if necessary) and n to render the actual lighting… Needing only n+1 passes would improve performance a lot - right? Is SM3.0 hardware (and its drivers) able to handle for-loops efficiently by now?

Well, yes and no. One problem is that you might run into texture map limits if you try to do all lighting in one go.
But it would certainly be OK to put a few passes into one.

You might want to look into something called deferred shading.

I think the best (cheapest) approach for scenes with moderate to high overdraw today is to render everything in a single pass (laying down depth first) and use forward shadow mapping. A single pass will always be faster, because you have to rasterize the geometry and sample the material textures only once. Deferred lighting is an extremely cool approach, but unfortunately it has problems with antialiasing and transparency.

Deferred shading looks even better with the 8 render targets on the G80.
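For illustration, filling the G-buffer with multiple render targets can be as simple as this sketch (the layout and all names are made up, and a position target is left out since position can be reconstructed from depth, as discussed below):

uniform sampler2D u_DiffuseMap;

varying vec3 v_WorldNormal;
varying vec2 v_Coordinates;

void main()
{
	// write one attribute per render target; lighting happens in a later pass
	gl_FragData[0] = texture2D( u_DiffuseMap, v_Coordinates );        // albedo
	gl_FragData[1] = vec4( normalize( v_WorldNormal )*0.5+0.5, 0.0 ); // normal

	// no position target: it can be reconstructed from the depth buffer
}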

  • OK, I had to realize that doing it all in one single pass won’t work, as I’d run out of uniforms and texture units, especially as I do hardware skinning.
    Plus, it is still not possible to use a uniform as a loop bound (I get no compilation errors in glslvalidate, but the shader doesn’t run on my GF7800GT) - see the sketch below.
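For illustration, a minimal fragment shader showing the pattern that fails here versus the one that works (all names are made up):

uniform int u_LightCount;               // dynamic loop bound: doesn’t run here
#define LIGHT_COUNT 2                   // compile-time bound: the loop can be unrolled
uniform vec3 u_LightColor[LIGHT_COUNT];

void main()
{
	vec3 Color = vec3( 0.0 );

	// for( int i = 0; i < u_LightCount; i++ )  // the variant that fails on the GF7800GT
	for( int i = 0; i < LIGHT_COUNT; i++ )      // unrollable, compiles and runs fine
		Color += u_LightColor[i];

	gl_FragColor = vec4( Color, 1.0 );
}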

What about the following: I create an array of shaders, one for each light count, say 1-6.
I can then choose the shader according to my light count, and if there are more lights than one single shader can handle, I render an additional pass for the remaining lights and blend those together, just like overlord suggested. That would save at least some passes. For this, I need to be able to determine how many uniforms and texture fetches a card can handle, so that I know how many shaders to create. Is there a way to do so?

No need to store position as a vector3:
– Each 2D screen pixel corresponds to a ray from the eyepoint into the 3D world (think raytracing)
– You inherently know the 2D position of each screen pixel, and you know the camera settings
– So if you write out the distance along this ray, you can reconstruct the original worldspace position
I don’t really understand how to do this. Could someone explain?

  • Oh, and one more thing: I thought about compiling one shader per material, using #defines instead of uniforms. This would result in a high shader object count, and if I really go for the first solution and create different shaders for each light count, I’d have a total shader object count of material_count*max_lights_per_pass. Is it even possible to create/compile that many shaders?

Thank you :)

To reconstruct the position, the inverse projection matrix is needed. Build a vector from the 2D screen coords (range -1…1), depth*2-1 (range -1…1), and 1.0 as the last component.

Multiply the matrix with that vector to unproject the ray.
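For illustration, a minimal fragment-shader sketch of this unprojection (the texture and uniform names are assumptions; the divide by w completes the reconstruction):

uniform sampler2D u_DepthTexture;       // the depth buffer bound as a texture
uniform mat4 u_ProjectionMatrixInverse;

varying vec2 v_ScreenCoord;             // screen position in range 0..1

void main()
{
	float Depth = texture2D( u_DepthTexture, v_ScreenCoord ).r;

	// build the vector: xy and depth mapped from 0..1 to -1..1, 1.0 as last component
	vec4 Clip = vec4( v_ScreenCoord*2.0-1.0, Depth*2.0-1.0, 1.0 );

	// multiply the inverse projection matrix with that vector, then divide by w
	vec4 Eye = u_ProjectionMatrixInverse*Clip;
	vec3 Position = Eye.xyz/Eye.w;      // reconstructed eye-space position

	gl_FragColor = vec4( Position, 1.0 );
}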

4 render targets (or 5 with the Z-buffer) are enough for deferred shading in many cases. A bigger problem is a good material system; especially anisotropic materials are difficult. Without anisotropic materials it’s no problem to use simplified BRDFs with 1D lookup textures stacked in a 2D texture…
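A sketch of what such a lookup could look like (all names are illustrative; each row of the 2D texture holds one material’s 1D response curve):

uniform sampler2D u_BRDFTable;   // rows: materials, columns: response over N·L
uniform float u_MaterialRow;     // 0..1, selects this material’s row
uniform vec3 u_LightDirection;   // normalized, world space

varying vec3 v_WorldNormal;

void main()
{
	float NdotL = max( dot( normalize( v_WorldNormal ), u_LightDirection ), 0.0 );

	// one texture fetch replaces the evaluation of the simplified BRDF
	gl_FragColor = texture2D( u_BRDFTable, vec2( NdotL, u_MaterialRow ) );
}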

Back to single pass for all: without a pre-Z pass it’s extremely slow. For example, a shader for 8 light sources, shadows and anisotropic materials needs roughly 280 instructions. Optimized with dynamic branches, the compiler produces up to 400 (and if the dynamic branching skips less than 50%, it will be slower). A realistic overdraw is between 5 and 10, so up to 4000 instructions are needed for one pixel…
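A sketch of the kind of per-light dynamic branch meant here (all names and the attenuation are illustrative):

#define LIGHT_COUNT 8

uniform vec3 u_LightPosition[LIGHT_COUNT];
uniform vec3 u_LightColor[LIGHT_COUNT];
uniform float u_LightRadiusSq[LIGHT_COUNT];

varying vec3 v_WorldPosition;
varying vec3 v_WorldNormal;

void main()
{
	vec3 Normal = normalize( v_WorldNormal );
	vec3 Color = vec3( 0.0 );

	for( int i = 0; i < LIGHT_COUNT; i++ )
	{
		vec3 LightVector = u_LightPosition[i]-v_WorldPosition;
		float DistanceSq = dot( LightVector, LightVector );

		// dynamic branch: skip lights whose radius doesn’t reach this fragment;
		// if less than ~50% of the fragments skip, this is slower than no branch
		if( DistanceSq < u_LightRadiusSq[i] )
		{
			float Attenuation = 1.0-DistanceSq/u_LightRadiusSq[i];
			Color += u_LightColor[i]*max( dot( Normal, normalize( LightVector ) ), 0.0 )*Attenuation;
		}
	}

	gl_FragColor = vec4( Color, 1.0 );
}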

Originally posted by Vexator:

  • Oh, and one more thing: I thought about compiling one shader per material, using #defines instead of uniforms. This would result in a high shader object count, and if I really go for the first solution and create different shaders for each light count, I’d have a total shader object count of material_count*max_lights_per_pass. Is it even possible to create/compile that many shaders?

It is possible; however, there are some disadvantages.

Compiling a high number of shaders will take a long time, and because the current API does not support saving compiled shader objects, your application’s startup time can increase by several minutes, depending on shader count and complexity.

If the driver decides to be “smart” and reoptimizes the shaders on the fly when uniforms change, it can hurt you badly whenever a combination of lights (including their assignment to shader uniforms) that was not used before is introduced.

@oc2k1
OK, so I could go for multiple lights per pass, but I should definitely have an initial depth-only pass, even if the number of lights is so low that I could handle them all in one pass, right?

I’ll give deferred shading a try, thanks for clearing this up :)

@komat
Thanks… I expected something like that :(

/edit: d’oh… I just realized that the problem is not uniforms or texture fetches, but varyings! My card supports “only” 32 varying floats (which is the minimum according to the standard), but I already need 8 per light source (a lot of spotlight- and attenuation-related stuff) and 10 for general stuff. So I can handle 2 lights maximum per pass. That’s not worth the effort.

Originally posted by Vexator:

/edit: d’oh… I just realized that the problem is not uniforms or texture fetches, but varyings! My card supports “only” 32 varying floats (which is the minimum according to the standard), but I already need 8 per light source (a lot of spotlight- and attenuation-related stuff) and 10 for general stuff. So I can handle 2 lights maximum per pass. That’s not worth the effort.

You can avoid the varying limit, at the cost of increased per-pixel work, by moving all per-light calculations to the pixel shader and providing it with the position of the fragment in light space (whichever space you store the light parameters in, in the uniforms) and a tangent-to-light-space matrix.

Calculate the lights in world space and not in texture space… In this case you only need to project the normal map from texture space to world space (3 instructions, and the result can be used for all lights). Then only the fragment’s world-space position is needed as a varying.
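For illustration, a fragment-shader sketch of that world-space approach (names are hypothetical; the TBN vectors also have to reach the fragment shader somehow, here as varyings):

#define LIGHT_COUNT 2

uniform sampler2D u_NormalMap;
uniform vec3 u_LightPosition[LIGHT_COUNT]; // world space
uniform vec3 u_LightColor[LIGHT_COUNT];

varying vec3 v_WorldPosition;
varying vec3 v_Tangent, v_Bitangent, v_Normal; // world space
varying vec2 v_Coordinates;

void main()
{
	// project the normal map from texture space to world space once;
	// the result can be used for all lights
	vec3 n = texture2D( u_NormalMap, v_Coordinates ).xyz*2.0-1.0;
	vec3 Normal = normalize( n.x*v_Tangent + n.y*v_Bitangent + n.z*v_Normal );

	vec3 Color = vec3( 0.0 );

	for( int i = 0; i < LIGHT_COUNT; i++ )
	{
		// the light direction is computed per pixel, so no per-light varyings are needed
		vec3 LightDirection = normalize( u_LightPosition[i]-v_WorldPosition );
		Color += u_LightColor[i]*max( dot( Normal, LightDirection ), 0.0 );
	}

	gl_FragColor = vec4( Color, 1.0 );
}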

For an example, look at my shadow example in Lumina.

If you have spotlights or other lights that don’t light the full viewport, deferred shading could be much faster.

I looked through your examples, but I frankly have to admit that I don’t understand what you mean. If it’s easier for you to explain this way, here is my experimental vertex shader code:

varying vec4 v_Projection;
varying vec2 v_Coordinates;

varying vec3 v_ViewDirection;

attribute vec3 a_Normal;
attribute vec3 a_Tangent;
attribute vec3 a_Bitangent;
attribute vec2 a_Coordinates;

uniform mat4 u_ViewMatrix;
uniform mat4 u_ViewMatrixInverse;
uniform mat4 u_ProjectionMatrix;

uniform vec3 u_CameraPosition;

#define LIGHT_COUNT 2

varying vec4 v_Projections[LIGHT_COUNT];
varying vec3 v_LightDirections[LIGHT_COUNT];
varying float v_Distances[LIGHT_COUNT];

uniform vec3 u_LightPosition[LIGHT_COUNT];
uniform mat4 u_LightMatrix[LIGHT_COUNT];

void main()
{	
	// pass on texture coordinates
	v_Coordinates = a_Coordinates;

	// get input vertex
	vec4 Vertex = gl_Vertex;
	
	// transform by inverse view matrix
	Vertex = u_ViewMatrixInverse*Vertex;

	// compute view direction
	vec3 ViewDirection = u_CameraPosition-Vertex.xyz;

	// compute normal matrix
	mat3 NormalMatrix = mat3( transpose( u_ViewMatrix ) );

	// compute normalized TBN matrix
	vec3 Normal = normalize( NormalMatrix*a_Normal );
	vec3 Tangent = normalize( NormalMatrix*a_Tangent );
	vec3 Bitangent = normalize( NormalMatrix*a_Bitangent );

	// transform view direction to tangent space
	v_ViewDirection.x = dot( Tangent, ViewDirection );
	v_ViewDirection.y = dot( Bitangent, ViewDirection );
	v_ViewDirection.z = dot( Normal, ViewDirection );

	// project to screen (keep Vertex in world space for the light loop below)
	gl_Position = u_ProjectionMatrix*Vertex;

	for( int i = 0; i < LIGHT_COUNT; i++ )
	{
		// compute light direction
		vec3 LightDirection = u_LightPosition[i]-Vertex.xyz;

		// compute light distance
		v_Distances[i] = length( LightDirection );

		v_LightDirections[i].x = dot( Tangent, LightDirection );
		v_LightDirections[i].y = dot( Bitangent, LightDirection );
		v_LightDirections[i].z = dot( Normal, LightDirection );

		// compute projection plane
		v_Projections[i] = ( u_LightMatrix[i]*(u_ViewMatrix*(Vertex)) );
	}
}

That way you will run out of varyings really soon.
You should just pass your TBN matrix as varying variables and transform the normal to world space per pixel. Or forget the normal mapping and use Blinn-style normal perturbation, like normal = normalize(normal + tangent*du + bitangent*dv), where du and dv are values that control the bending of the normal in the direction of the tangent/bitangent.
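A minimal sketch of that Blinn-style perturbation, assuming du and dv come from a two-channel bump map (texture and variable names are made up):

uniform sampler2D u_BumpMap;

varying vec3 v_Normal, v_Tangent, v_Bitangent; // world space
varying vec2 v_Coordinates;

void main()
{
	// du/dv in range -1..1 control how far the normal bends along tangent/bitangent
	vec2 dudv = texture2D( u_BumpMap, v_Coordinates ).xy*2.0-1.0;
	vec3 Normal = normalize( v_Normal + v_Tangent*dudv.x + v_Bitangent*dudv.y );

	gl_FragColor = vec4( Normal*0.5+0.5, 1.0 ); // visualize the perturbed normal
}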

Calculate the Bitangent with a cross product, that is much cheaper…

And you shouldn’t use the transpose() function; it’s better to transpose the matrix on the CPU side.

u_CameraPosition? The cam is always located at (0;0;0). Move the world, not the cam…
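Applied to the vertex shader above, the cross-product and CPU-transpose tips might look like this sketch (u_NormalMatrix is a hypothetical mat3 uniform that the application transposes and uploads itself):

attribute vec3 a_Normal;
attribute vec3 a_Tangent;

uniform mat3 u_NormalMatrix; // transposed on the CPU instead of calling transpose()

varying vec3 v_Normal, v_Tangent, v_Bitangent;

void main()
{
	v_Normal = normalize( u_NormalMatrix*a_Normal );
	v_Tangent = normalize( u_NormalMatrix*a_Tangent );

	// a cross product instead of a third matrix transform
	v_Bitangent = cross( v_Normal, v_Tangent );

	gl_Position = gl_ModelViewProjectionMatrix*gl_Vertex;
}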

Another question: Why did you use other uniforms than the GLSL defaults?

OK, I’m going to implement your suggestions, thanks!

Why did you use other uniforms than the GLSL defaults?
I’m not using any fixed-pipeline functionality, so I don’t want to use the fixed variables either. I think it’s just cleaner this way. I’m also not using the built-in modelview & projection matrix stacks, so I wouldn’t be sure the GLSL default variables behaved as expected anyway.

I use stencil shadows. If there are multiple lights, the scene is divided into several regions lit by different lights. How do you decide how many lights to pass to the shader in order to render a certain surface correctly? I think multiple passes are the better way. On the other hand, if some of the lights are different types from the others (directional, parallel, spot, or even something special you define yourself, like projecting a light map instead), I don’t think putting them all in a single shader is a good choice; it decreases flexibility somewhat.
