
View Full Version : deferred rendering and framebuffer bandwidth



adamce
10-06-2014, 02:06 AM
hi,
i want to implement deferred rendering for a university course.



The first stage of a deferred renderer is to create the G-buffer, which is
implemented using a framebuffer object with several attachments.
OpenGL can support framebuffers with up to eight attachments, and each
attachment can have up to four 32-bit channels (using the GL_RGBA32F
internal format, for example). However, each channel of each attachment
consumes some memory bandwidth, and if we don't pay attention to the
amount of data we write to the framebuffer, we can start to outweigh the
savings of deferring shading with the added cost of the memory
bandwidth required to save all of this information.


They continue with a somewhat elaborated attachment scheme, where they for instance span the normal over two attachments.

I understand that bandwidth can be valuable, but does it really matter whether I have two 32-bit × 4-channel attachments or, say, eight 8-bit × 4-channel attachments? Both add up to 256 bits of output per fragment.

Or do the graphics card vendors implement the attachments in a way that they aren't packed nicely, so that 8 bits would be extended to 32? Or maybe it's only about the number of channels per attachment, and a vec3 would be extended to a vec4?

thanks,
adam

Dark Photon
10-07-2014, 07:45 PM
i want to implement deferred rendering for a university course.


G-buffer... several attachments...each channel of each attachment consumes some memory bandwidth, and if we don’t pay attention to the amount of data we write to the framebuffer, we can start to outweigh the savings of deferring shading with the added cost of the memory bandwidth ...

They continue with a somewhat elaborated attachment scheme, where they for instance span the normal over two attachments.

I understand that bandwidth can be valuable, but does it really matter whether I have two 32-bit × 4-channel attachments or, say, eight 8-bit × 4-channel attachments? Both add up to 256 bits of output per fragment.

What you have to remember is that what's actually stored in the G-buffer is *not* necessarily the format that's output at the tail end of your fragment shader. The GPU does run-time format conversion to map the float/vec* outputs of the fragment shader in your G-buffer rasterization pass to the format(s) of your FBO attachments. That's what actually gets written to memory, and, just as importantly, it's the format that gets *read* back from memory later when you go to apply your lighting pass(es). So reducing how many bits you use for each component of your G-buffer saves you both GPU write bandwidth and GPU read bandwidth.