Deferred shading

I am just about to move our lighting into a deferred step. It seems that the only additional buffer we need to add is an RGB buffer for the normal. The existing color texture can remain, the attached depth texture can be used to reconstruct the pixel’s xyz position in the deferred pass, and we just need to add a normal buffer. I do not know how to add this “gbuffer” or how to write to it in the frag shader. All I know how to do is set up an FBO with a color texture and a depth attachment. Where can I find more information on this?

Dominik Göddeke has a tutorial that covers (among other GPGPU things) how to create and render to a floating-point texture using FBOs:
http://www.mathematik.uni-dortmund.de/~goeddeke/gpgpu/tutorial.html#arrays5

An NVIDIA GDC presentation about FBOs in general:
http://http.download.nvidia.com/develope…ffer_Object.pdf

GameDev.net has two tutorials on FBOs, and the second covers multiple render targets:
http://www.gamedev.net/reference/programming/features/fbo1/
http://www.gamedev.net/reference/programming/features/fbo2/

Not OpenGL-specific:
GPU Gems 2 has a chapter on the deferred shading used in S.T.A.L.K.E.R., which I think is very informative.
GPU Gems 3 has a similar article about the game Tabula Rasa.

Why do I need floating-point textures? This should be all I need:

Color - RGB
Depth - 24 bit depth (I think)
Normal - RGB

It should not be necessary to create a buffer for the fragment position, because that can be figured out from the camera FOV and aspect, fragment screen coord, and depth.
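
For example, here is a rough GLSL sketch of that reconstruction, assuming a standard symmetric perspective projection and texture coordinates normalized to [0,1] (all the uniform names here are made up):

	uniform sampler2D u_depthTex;   // hypothetical names throughout
	uniform float u_tanHalfFov;     // tan(vertical FOV / 2)
	uniform float u_aspect;         // viewport width / height
	uniform float u_near;           // near plane distance
	uniform float u_far;            // far plane distance

	// Reconstruct the eye-space position of a fragment from its screen
	// coordinate and the stored depth value.
	vec3 reconstructEyePos(vec2 texcoord)
	{
		float d = texture2D(u_depthTex, texcoord).r;       // window depth in [0,1]
		// Undo the hyperbolic depth mapping of a standard GL projection.
		float viewZ = (u_near * u_far) / (u_far - d * (u_far - u_near));
		vec2 ndc = texcoord * 2.0 - 1.0;                   // to [-1,1]
		return vec3(ndc.x * u_aspect * u_tanHalfFov * viewZ,
		            ndc.y * u_tanHalfFov * viewZ,
		            -viewZ);                               // GL looks down -Z
	}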

I found two PDFs about deferred shading; the one describing Killzone 2 contains a pretty nice description of how all the color/depth buffers are laid out.

http://www.talula.demon.co.uk/DeferredShading.pdf
http://www.guerrilla-games.com/publications/dr_kz2_rsx_dev07.pdf

Here’s my code for setting up a texture-based render buffer. The buffer can have an optional color and/or depth buffer, and now I am adding a normal buffer.

Do I just create another RGBA8 texture and use COLOR_ATTACHMENT1? I tried this and the FBO did not produce an error. Now how can I render to the normal texture?

	If colorbuffer
		buffer.colorbuffer=CreateTexture(tw,th,GL_RGBA,GL_TEXTURE_RECTANGLE_ARB)
		buffer.colorbuffer._width=tw
		buffer.colorbuffer._height=th	
		buffer.colorbuffer.bind()
		buffer.colorbuffer.link.remove()
		buffer.colorbuffer.link=Null
		buffer.colorbuffer.Clamp()
		buffer.colorbuffer.setfilter TEXTUREFILTER_PIXEL
		glGenFramebuffersExt 1,Varptr buffer.colorbuffer.framebuffer[0]
		buffer.framebuffer=buffer.colorbuffer.framebuffer[0]
		glTexImage2D buffer.colorbuffer.target(),0,GL_RGBA8,buffer.colorbuffer._width,buffer.colorbuffer._height,0,GL_RGB,GL_UNSIGNED_BYTE,Null	
		glBindFramebufferEXT GL_FRAMEBUFFER_EXT,buffer.colorbuffer.framebuffer[0]
		glFramebufferTexture2DEXT GL_FRAMEBUFFER_EXT,GL_COLOR_ATTACHMENT0_EXT,buffer.colorbuffer.target(),buffer.colorbuffer.index(),0
	EndIf
	
	If depthbuffer
		buffer.depthbuffer=CreateTexture(tw,th,GL_RGBA,GL_TEXTURE_RECTANGLE_ARB)
		buffer.depthbuffer._width=tw
		buffer.depthbuffer._height=th	
		buffer.depthbuffer.bind()
		buffer.depthbuffer.link.remove()
		buffer.depthbuffer.link=Null
		buffer.depthbuffer.Clamp()
		buffer.depthbuffer.setfilter TEXTUREFILTER_PIXEL
		glTexImage2D buffer.depthbuffer.target(),0,GL_DEPTH_COMPONENT24,buffer.depthbuffer._width,buffer.depthbuffer._height,0,GL_DEPTH_COMPONENT,GL_UNSIGNED_INT,Null
		glFramebufferTexture2DEXT GL_FRAMEBUFFER_EXT,GL_DEPTH_ATTACHMENT_EXT,buffer.depthbuffer.target(),buffer.depthbuffer.index(),0
	EndIf
	
	If normalbuffer
		' Mirrors the color buffer setup, but attaches to COLOR_ATTACHMENT1.
		' (Assumes a normalbuffer field analogous to colorbuffer/depthbuffer.)
		buffer.normalbuffer=CreateTexture(tw,th,GL_RGBA,GL_TEXTURE_RECTANGLE_ARB)
		buffer.normalbuffer._width=tw
		buffer.normalbuffer._height=th
		buffer.normalbuffer.bind()
		buffer.normalbuffer.link.remove()
		buffer.normalbuffer.link=Null
		buffer.normalbuffer.Clamp()
		buffer.normalbuffer.setfilter TEXTUREFILTER_PIXEL
		glTexImage2D buffer.normalbuffer.target(),0,GL_RGBA8,buffer.normalbuffer._width,buffer.normalbuffer._height,0,GL_RGBA,GL_UNSIGNED_BYTE,Null
		glFramebufferTexture2DEXT GL_FRAMEBUFFER_EXT,GL_COLOR_ATTACHMENT1_EXT,buffer.normalbuffer.target(),buffer.normalbuffer.index(),0
	EndIf
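
For reference, once both attachments are bound, you enable them with glDrawBuffers (from ARB_draw_buffers) and the fragment shader writes to them through gl_FragData. A minimal GLSL sketch, where v_normal, v_texcoord and u_diffuse are made-up names:

	// gl_FragData[i] is routed to the texture bound at COLOR_ATTACHMENTi
	// once glDrawBuffers enables both attachments.
	uniform sampler2D u_diffuse;   // hypothetical material texture
	varying vec3 v_normal;         // interpolated eye-space normal
	varying vec2 v_texcoord;

	void main()
	{
		gl_FragData[0] = texture2D(u_diffuse, v_texcoord);           // color buffer
		// Remap the normal from [-1,1] to [0,1] so it fits an RGBA8 target.
		gl_FragData[1] = vec4(normalize(v_normal) * 0.5 + 0.5, 0.0); // normal buffer
	}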

----EDIT-----

Wow, that second GameDev article is really good! It is a rare thing that I come across documentation worth reading!

That was easy. I still don’t understand why people use floating-point textures and write the frag position to a buffer.

Because in the early days, rendering to a depth texture (i.e. using it as the Z-buffer and a texture at the same time) was not possible. Also, reconstructing the world position from a Z-buffer is a bit more involved (but that should not be much of a problem today).

What you really should take care of is the precision of the normals. RGB8-encoded normals are simply not precise enough. RGB10_A2 is quite OK (still with noticeable artifacts), but needs EXTX_framebuffer_mixed_formats to be useful. RGBA16F for the normals is fine; I didn’t notice any problems with it.

What if you used the RG terms to encode the X component and the BA terms to encode the Y component? The precision would be less than a float but greater than a byte.

The absolute value of the Z component can then be calculated from those terms, since the normal has unit length: |z| = sqrt(1 - x² - y²).
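
A rough GLSL sketch of one way to do that packing; the high/low byte split shown here is illustrative, not the only option:

	// Pack normal.x into R+G and normal.y into B+A, ~16 bits each.
	vec4 packNormalXY(vec3 n)
	{
		vec2 xy = n.xy * 0.5 + 0.5;              // [-1,1] -> [0,1]
		vec2 hi = floor(xy * 255.0) / 255.0;     // coarse 8-bit part
		vec2 lo = fract(xy * 255.0);             // fine remainder
		return vec4(hi.x, lo.x, hi.y, lo.y);
	}

	vec3 unpackNormalXY(vec4 p)
	{
		vec2 xy = vec2(p.x + p.y / 255.0, p.z + p.w / 255.0) * 2.0 - 1.0;
		// |z| follows from unit length; its sign is discussed below.
		float z = sqrt(max(0.0, 1.0 - dot(xy, xy)));
		return vec3(xy, z);
	}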

The setup I used was an R32F buffer for the linear depth and an RGB16 buffer for the normals, which gave high enough precision for the stuff I was doing. I did run into trouble when trying to use the same setup on different GPUs, though, so I can see the benefit of avoiding float textures and sticking with the RGB8 format.

The absolute value of the Z component can then be calculated from those terms.

A square root has two solutions; which one do you choose? The normal is not always pointing out of the screen. Think about interpolated vertex normals, or normal maps, which can create normals that differ greatly from the face normal. I wonder why people always seem to forget that when they propose the two-components-per-normal approach…

If the normal isn’t pointing towards the viewer, then it would have been dismissed unless back-face culling is disabled.

Let’s say we need back-face culling disabled… well, I can make the low byte of the Y value odd or even without creating too much inaccuracy, and use that as a +/- flag to indicate the Z direction.
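
Roughly, that parity trick could look like this in GLSL (just a sketch; yLowByte stands for the low byte of the packed Y value, and GLSL 1.x has no bitwise operators, so the low bit is emulated with mod):

	// Steal the parity (odd/even) of Y's low byte as the sign of Z.
	float encodeZSign(float yLowByte, float nz)  // yLowByte in 0..255
	{
		float even = yLowByte - mod(yLowByte, 2.0);   // clear the low bit
		if (nz < 0.0)
			even += 1.0;                              // set it for -Z
		return even;
	}

	float decodeZSign(float yLowByte)
	{
		return mod(yLowByte, 2.0) > 0.5 ? -1.0 : 1.0;
	}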

How compatible is the second color attachment with NVidia and ATI SM 3.0 cards?

If the normal isn’t pointing towards the viewer, then it would have been dismissed unless back-face culling is disabled.

You are talking about the face normal. I was talking about the per-pixel normals, which - unless you are using flat shading - are what matter for lighting.

Of course, if you can somehow encode the sign of the z component into the other two, then it should be possible to leave z out.

You should be able to use multiple color attachments as soon as ARB_draw_buffers is supported.

Same principle. Assuming there’s no transparency (a rather valid assumption; otherwise you’d need to store more than one normal per pixel), everything in the world with a normal that’s not facing the viewer should be invisible to the viewer. So you can take the pixel’s position and the two components of the normal to create a plane, then select the sign of the third component so that the normal ends up on the same side of the plane as the viewer.
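
In GLSL, that sign selection might look roughly like this (a sketch; p is the reconstructed eye-space fragment position, and in eye space the camera sits at the origin):

	// Pick the sign of n.z so the reconstructed normal lies on the same
	// side as the viewer. The vector from the fragment to the viewer is -p.
	vec3 selectZSign(vec2 nxy, vec3 p)
	{
		float z = sqrt(max(0.0, 1.0 - dot(nxy, nxy)));
		if (dot(vec3(nxy, z), -p) < dot(vec3(nxy, -z), -p))
			z = -z;                  // the -z solution faces the viewer better
		return vec3(nxy, z);
	}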

“everything in the world with a normal that’s not facing the viewer should be invisible to the viewer”

That assumption breaks with interpolated normals, and it also breaks with normal mapping.

Can’t imagine it does. If the interpolated normals face away from the viewer, I believe those pixels should be discarded rather than shaded.

You’re right that they can face away, but in this case discarding the pixels would create holes in the geometry.

In any case, a normalized three-dimensional vector has only two degrees of freedom, so you can encode it with a 360-degree yaw angle and a 180-degree pitch angle.
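
For example, a GLSL sketch of that angle encoding (the two-argument atan recovers the yaw):

	// Two angles fully determine a unit vector.
	vec2 encodeSpherical(vec3 n)
	{
		return vec2(atan(n.y, n.x),      // yaw in [-pi, pi]
		            acos(n.z));          // pitch in [0, pi]
	}

	vec3 decodeSpherical(vec2 a)
	{
		float s = sin(a.y);
		return vec3(cos(a.x) * s, sin(a.x) * s, cos(a.y));
	}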

A cos and a sin per pixel, per light volume? Sounds like a false economy to me.

Like I said, it is possible to encode the sign of the z component of the normal in a hackish but acceptable way. And if it means I can use an RGBA texture for the normal buffer, cool. That means I only use 2 RGBA images and a depth buffer for deferred lighting, and I don’t suffer much from the bandwidth problems these techniques tend to experience. And the normal resolution would be about 0.00006 (1/128/128).

I ordered an ATI X1550 for low-end testing. Is this going to work on SM 3.0 hardware?

Well, if you’re doing deferred shading, performing this conversion for a pixel only once and reusing it for many lights seems like a good option to me… Furthermore, although normalizing a vector with a cube map is slower than the normalize function in the shader, these kinds of lookups can still be used for the computationally more expensive trigonometric functions.
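
As a sketch of that idea in GLSL, where u_angleToNormal is a made-up lookup texture that would be filled once on the CPU with precomputed normals:

	// Replace per-pixel trig with a texture lookup that maps
	// (yaw, pitch), remapped to [0,1], to a unit normal.
	uniform sampler2D u_angleToNormal;   // hypothetical LUT

	vec3 decodeViaLUT(vec2 angles01)
	{
		return texture2D(u_angleToNormal, angles01).xyz * 2.0 - 1.0;
	}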

Semi-unrelated, but if the normal map is stored in tangent space, no sign storage is necessary, correct? (Just want to make sure my understanding is solid.)

FYI: depending on your usage, you may be able to use this:
http://code.google.com/p/lightindexed-deferredrender/