Improving deferred shading

I do the following to reconstruct the view-space position for deferred shading:

initial pass, vs:

vViewPosition = gl_ModelViewMatrix*gl_Vertex;

fs:

vViewPosition /= vViewPosition.w;

//then i store (-vViewPosition.z) in the MRT
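For reference, a minimal sketch of that initial-pass fragment shader (the vNormal output is just an example of what else the MRT might carry; the division goes into a local since varyings are read-only in the fragment shader):

varying vec4 vViewPosition;
varying vec3 vNormal;   // assumed: eye-space normal passed down from the vertex shader

void main()
{
	vec4 ViewPosition = vViewPosition/vViewPosition.w;

	// positive eye-space distance along -Z, into a float color attachment
	gl_FragData[0] = vec4( -ViewPosition.z );

	// example second target: the normal (however the G-buffer is laid out)
	gl_FragData[1] = vec4( normalize( vNormal ), 0.0 );
}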

deferred lighting pass, fs:

// Depth = (-vViewPosition.z) read from MRT

float invTanHalfFOV = 1.0/tan( radians(FOV*0.5) );
vec3 Ray = vec3( ((gl_FragCoord.xy/vec2(Width, Height))-0.5)*2.0, -invTanHalfFOV );
Ray /= invTanHalfFOV;

vec4 Position = vec4( Ray*(Depth), 1.0 );
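Assembled for reference, the reconstruction part of that lighting pass looks roughly like this (a sketch; the u_DepthTexture sampler and the Width/Height/FOV uniforms are assumed names matching the snippets above):

uniform sampler2D u_DepthTexture;   // assumed: MRT channel holding (-vViewPosition.z)
uniform float Width;
uniform float Height;
uniform float FOV;

void main()
{
	vec2 Coordinates = gl_FragCoord.xy/vec2( Width, Height );
	float Depth = texture2D( u_DepthTexture, Coordinates ).r;   // = -vViewPosition.z

	float invTanHalfFOV = 1.0/tan( radians(FOV*0.5) );
	vec3 Ray = vec3( (Coordinates-0.5)*2.0, -invTanHalfFOV );
	Ray /= invTanHalfFOV;                       // Ray now lies on the z = -1 plane

	vec4 Position = vec4( Ray*Depth, 1.0 );     // reconstructed eye-space position

	// ... lighting in eye space continues from here ...
	gl_FragData[0] = Position;                  // placeholder output
}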

this works perfectly fine, but it requires a floating-point color attachment to store (-vViewPosition.z). I guess the same should be possible using a depth attachment; I got it working using some hacks, but lost too much precision.

any ideas?

For both problems you should use the simplest solution. To calculate the eye-space distance from a Z-buffer value you have to calculate two values:
http://www.sjbaker.org/steve/omniv/love_your_z_buffer.html


     depthparameter.x = zFar / ( zFar - zNear )
     depthparameter.y = zFar * zNear / ( zNear - zFar )
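
For example, with zNear = 1.0 and zFar = 100.0 these work out to:

     depthparameter.x = 100 / 99         ≈  1.0101
     depthparameter.y = 100 / (1 - 100)  ≈ -1.0101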

Don’t do that in the shader; compute the two values on the CPU using doubles to preserve precision. Pass both values as a uniform vec2 to the lighting shader:


uniform sampler2DRect g_depth;
uniform vec2 DepthParameter;
....
    float Z = DepthParameter.y/(DepthParameter.x - texture2DRect(g_depth,gl_FragCoord.xy).r);

The second problem: your FOV construct looks crazy. A much simpler and faster possibility is this:


varying vec3 unpro;
....
   vec3 ModelView = vec3( unpro.xy/unpro.z * Z, Z );

unpro is the modelview position of the light bounding volume.
That can be calculated in the vertex shader with:


   gl_Position = ftransform();
   unpro = ( gl_ModelViewMatrix * gl_Vertex ).xyz;

Always draw convex volumes around the lights (point light: sphere, spotlight: cone). A fullscreen quad for a global light can be drawn by unprojecting the near plane, or more simply by drawing a sphere that intersects all frustum sides (same position as the camera, with the Z test disabled).

Sometimes it’s recommended or required to draw the light volume with front-face culling and glDepthFunc(GL_GREATER) (required if the near plane is inside the light volume).

Why do you divide view space position by view space W per fragment? You should divide in the vertex shader if you expect Wview != 1.

And why these complex calculations in the fragment shader if you could just pass in the point on the Z=1 plane through which the ray passes as interpolated coordinates?

varying vec2 rayDir;
position = vec3(rayDir * Depth, Depth);
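
A sketch of a matching vertex shader for a fullscreen quad (the u_TanHalfFOV and u_Aspect uniforms are hypothetical; the quad is assumed to have its corners at ±1 in x/y with w = 1):

varying vec2 rayDir;
uniform float u_TanHalfFOV;   // assumed: tan( vertical FOV * 0.5 )
uniform float u_Aspect;       // assumed: viewport width / height

void main()
{
	// quad vertices are already in clip space, so no transform is needed
	gl_Position = gl_Vertex;

	// xy of the point on the Z = 1 plane this fragment's ray passes through
	rayDir = gl_Vertex.xy*vec2( u_TanHalfFOV*u_Aspect, u_TanHalfFOV );
}

With Depth being the stored (-vViewPosition.z), the reconstructed position then has a positive Z, so the light position has to be expressed in the same flipped-Z convention (or the Z negated).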

this works perfectly fine, but it requires a floating-point color attachment to store (-vViewPosition.z). I guess the same should be possible using a depth attachment; I got it working using some hacks, but lost too much precision.

What exactly did you try? Remember that a depth attachment stores window space Z, i.e. (Zclip / Wclip) scaled and biased by the viewport transform.
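
To spell that out for a standard perspective projection (assuming the default glDepthRange of [0,1]):

    Zclip = -Zeye*(zFar + zNear)/(zFar - zNear) - 2*zFar*zNear/(zFar - zNear)
    Wclip = -Zeye

    depth = 0.5*(Zclip/Wclip) + 0.5
          = zFar/(zFar - zNear) + (zFar*zNear/(zFar - zNear))/Zeye
          = DepthParameter.x - DepthParameter.y/Zeye

which is exactly why Zeye = DepthParameter.y/(DepthParameter.x - depth) from the earlier post recovers the (negative) eye-space Z.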

Yes, using a vec2 for the ray direction is possible if gl_Position.w is 1.0 (not unpro.w). But someone might need the correctly interpolated model-view position too (for example for a light/fog interaction), or working clip planes on ATI cards (which don’t work without the ftransform).
A correct solution would be to declare the raydir (or unpro) as a “noperspective varying”, but that would require EXT_gpu_shader4 and won’t work on a non-Shader-Model-4.0 card.

I implicitly assumed that the deferred lighting pass uses a simple fullscreen quad.

As we speak of deferred shading: Does anybody have an idea how one would use completely different shaders with a deferred shading approach? For example I have a procedural shader for wood, another one for plastic and another one that does anisotropic metal. The only ideas I came up with so far would be a giant switch-like structure in the deferred shading shader (which probably means breaking the instruction limit) or using multipass rendering, one pass for each material, and blending the results. But that doesn’t sound too good to me either.

Are there other ways to work around this limitation of deferred shading?

thanks for your quick replies!

I’ve just implemented oc2k1’s suggestion, and it almost works. The computed position is still dependent on the camera’s position/orientation.

is there a way I can do it using a simple fullscreen quad? I don’t want to rely on my light hulls; that’d be one less potential source of error.

btw the near plane is at 1.0 and the far plane at 100, in case that matters.

thank you!

@cm: you could use the stencil buffer to mask out geometry that you want to render with another shader.

As we speak of deferred shading: Does anybody have an idea how one would use completely different shaders with a deferred shading approach?

I’m not exactly sure what you mean but it’s clear that DS is not without its share of pitfalls and shortcomings. Certainly not a panacea.

There’s a great chapter in GPU Gems 3 that describes many of the pros, cons, workarounds, etc given varying hw capabilities and so on.

Alpha blending (or lack thereof) figures prominently in the cons column.

ok, there’s definitely something wrong with the way I try to linearize depth… the first shader reads the distance from the color attachment the way I did it before (see 1st post):

uniform sampler2D u_DepthTexture;
varying vec2 v_Coordinates;

void main()
{
	float Depth = texture2D( u_DepthTexture, v_Coordinates ).r;

	gl_FragData[0] = vec4( Depth*0.01 );
}

the 2nd shader reads from a depth map and linearizes it the way oc2k1 suggested:

uniform sampler2D u_DepthTexture;
uniform vec2 DepthParameter;
varying vec2 v_Coordinates;

void main()
{
	float Depth = texture2D( u_DepthTexture, v_Coordinates ).r;
	float Z = DepthParameter.y/( DepthParameter.x-Depth );

	gl_FragData[0] = vec4( (-Z)*0.01 );
}

:’(

A small performance trick: rewrite it in the form 1.0 / (a * z + b) to shave off one instruction from the computation. If my math is right, it would be:

a = (zFar - zNear) / (zFar * zNear);
b = -1.0 / zNear;
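
In shader terms this would look something like the following (a sketch; DepthParamAB is a hypothetical uniform holding vec2(a, b), and with these values Z comes out as the same negative eye-space Z as DepthParameter.y/(DepthParameter.x - depth) above):

uniform sampler2DRect g_depth;
uniform vec2 DepthParamAB;   // assumed: x = a, y = b as computed above
....
    float Z = 1.0/( DepthParamAB.x*texture2DRect( g_depth, gl_FragCoord.xy ).r + DepthParamAB.y );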

Far from all deferred shading implementations use a fullscreen quad. In fact, I’d say that’s probably limited to techdemos. Full scale implementations usually use either quads of the rectangular screen extents of each light (this is what I did in my recent deferred shading demo), or geometric representations of the light, like spheres and cones. The latter has the advantage that you can combine it with occlusion culling to find out if a light is visible or not and use stenciling to limit shading to the lit areas only.

This is tricky, but could be workable, depending on how diverse a set of shaders and materials you plan to use. Most deferred shading implementations store a fixed set of data, but if you want something more generic you could store a material ID and then have a set of generic attributes that belongs to each material. This of course comes down to the big switch-case shader, but you may be able to merge the computations for different materials in smart ways to reduce this problem. Instead of the switch-case you may be able to store a bitfield with flags for what components to include, like diffuse, ambient, specular, anisotropic lighting etc. Could turn out heavy on dynamic branching though, but maybe less so than the big switch-case scenario.
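
As a rough sketch of that material-ID idea (everything here is made up for illustration: the G-buffer layout, the u_MaterialTexture/u_AlbedoTexture names and the ID encoding):

uniform sampler2D u_MaterialTexture;   // assumed: material ID stored in the alpha channel
uniform sampler2D u_AlbedoTexture;     // assumed: base color
varying vec2 v_Coordinates;

void main()
{
	float MaterialID = texture2D( u_MaterialTexture, v_Coordinates ).a*255.0;
	vec3 Albedo = texture2D( u_AlbedoTexture, v_Coordinates ).rgb;

	// NdotL/NdotH would come from the usual position/normal reconstruction
	float NdotL = 1.0;   // placeholder
	float NdotH = 1.0;   // placeholder

	vec3 Color = Albedo*NdotL;                     // shared diffuse part

	if ( MaterialID > 0.5 && MaterialID < 1.5 )    // 1 = plastic: add a specular term
		Color += vec3( pow( NdotH, 32.0 ) );
	else if ( MaterialID > 1.5 )                   // 2 = metal: tinted specular only
		Color = Albedo*pow( NdotH, 64.0 );

	gl_FragColor = vec4( Color, 1.0 );
}

The branches could just as well be replaced by the bitfield-of-flags variant described above.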

Try this math and see if it helps:
http://www.humus.ca/temp/Linearize%20depth.txt

I haven’t actually verified the GL math so it might be wrong, but the DX stuff has worked for me.

I guess it depends on how many light sources you have, how far they extend and how complex the lighting equation is. If the latter is simple then bandwidth consumption becomes more important and it’s good to combine several lights in a single pass.

In the last sentence, do you mean rendering the light extents just to get visibility/lit areas, not for calculating the lighting itself?

btw. one of the best ways to store Z is to use a float depth buffer (unfortunately only a G80 extension currently) and store 1/Zeye.

These are my (simplified) shaders atm. Do you see any obvious problems? The computed position still depends on both the camera’s and the light’s position/orientation:

varying vec3 unpro;

void main()
{
	unpro = ( gl_ModelViewMatrix*gl_Vertex ).xyz;
	gl_Position = gl_ModelViewProjectionMatrix*gl_Vertex;
}
varying vec3 unpro;

uniform sampler2D u_DepthTexture;
uniform sampler2D u_NormalHeightTexture;
uniform float u_Width;
uniform float u_Height;
uniform vec3 u_LightPosition;

void main()
{
	float zNear = 1.0;
	float zFar = 100.0;

	vec2 DepthParameter;
	DepthParameter.x = zFar/( zFar-zNear );
	DepthParameter.y = zFar*zNear/( zNear-zFar );

	vec2 Coordinates = gl_FragCoord.xy*vec2( 1.0/u_Width, 1.0/u_Height );

	// reconstruct position
	float Depth = texture2D( u_DepthTexture, Coordinates ).r;
	float LinearDepth = DepthParameter.y/( DepthParameter.x-Depth );
	vec3 Position = vec3( unpro.xy/unpro.z*LinearDepth, LinearDepth );

	// fetch normal
	vec3 Normal = texture2D( u_NormalHeightTexture, Coordinates ).rgb;

	// compute light vector
	vec3 LightVector = normalize( u_LightPosition-Position );

	// compute normal * light vector
	float NdotL = dot( Normal, LightVector );

	// output final color
	gl_FragData[0] = vec4( vec3(NdotL), 1.0 );
}

For both the lit-tagging pass and the actual lighting pass.

The latter has the advantage that you can combine it with occlusion culling to find out if a light is visible or not and use stenciling to limit shading to the lit areas only.

How can you do that if you don’t have the scene’s depth buffer in the lighting pass?

anyway, can no one tell me what I’m doing wrong here? :(

So basically: Render back faces with stencil set on depth fail, render front faces with stencil+depth test, stencil reset and do lighting?

Deferred shading requires you to split up the data into different textures, so basically you output color into one texture, shininess and glossiness into another, normals into a third and so on (usually compressed together in some way).
So in the first step you can use several textures.
Then in the combiner shader you can simulate most kinds of materials using only one universal material shader.

You DO have the scene’s depth buffer available. For the light-tagging and/or occlusion culling passes you use it as a depth buffer and for the lighting pass you use it as a texture.

It’s not immediately obvious why it’s not working for you. I haven’t done any deferred shading in OpenGL, only in DX. It should be similar, but not the same since GL and DX treat Z a bit differently.