FBOs

Pretty much like everybody, I’ve been using FBOs for a long time. Then today I got to thinking and realized I don’t understand everything about FBOs, particularly the point of some of the features of the API. Maybe they were designed to be future-proof and have no immediate use, I don’t know, but here are some questions that are kind of puzzling me:

  1. What is the point of creating more than one FBO? Even if you want to render to many textures per frame, you can just use the same FBO and bind/unbind the color/depth textures/buffers. It’s also much faster, since on Vista an FBO switch is, as I discovered, extremely slow.

  2. What is the point of using render buffers in place of textures? Think of using a depth texture instead of a depth buffer everywhere: you get the ability to use that depth texture in other passes for free. Is rendering to a texture slower than rendering to a buffer? (Note: the only exception I can understand is multi-sampling, where you have to use an intermediate buffer.)

Y.

1 - even if you prefer to bind textures dynamically, you can still have several FBOs with different sets of binding points:
FBO 1: depth, color0, color1
FBO 2: depth, color
FBO 3: depth
FBO 4: color
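A minimal sketch of that layout, using the GL 3.0 core entry points (with EXT_framebuffer_object, append EXT to the calls and tokens); the texture ids (`depthTex`, `color0`, `color1`, `shadowTex`) are placeholders assumed to have been created elsewhere:

```c
GLuint fbos[2];
glGenFramebuffers(2, fbos);

/* "FBO 1": depth + two color attachments */
glBindFramebuffer(GL_FRAMEBUFFER, fbos[0]);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                       GL_TEXTURE_2D, depthTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, color0, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1,
                       GL_TEXTURE_2D, color1, 0);

/* "FBO 3": depth only, e.g. a shadow map */
glBindFramebuffer(GL_FRAMEBUFFER, fbos[1]);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                       GL_TEXTURE_2D, shadowTex, 0);
glDrawBuffer(GL_NONE);  /* no color attachment to write to */
glReadBuffer(GL_NONE);

/* Per frame, switching render targets is then a single bind,
 * with no re-attachment work: */
glBindFramebuffer(GL_FRAMEBUFFER, fbos[0]);
```

Each FBO keeps its attachment set fixed, so the only per-frame cost is the bind itself.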

2 - render buffers may be faster (early Z, memory alignment) and can allow more features (multisampling). In other words, they’re usually more hardware-friendly.

Rendering to a renderbuffer is usually better if you don’t need the result as a texture, since the hardware can use the most suitable format for the renderbuffer.

For example, when you need a depth buffer, rendering to a depth texture can be slow, since the texture has a fixed format (i.e. 16-bit or 24-bit). Also, a renderbuffer could contain additional information to enable early z-culling, etc.

It is very likely that some hardware is not able to render to every texture format and therefore uses a renderbuffer behind the scenes, only copying the result to the texture as a final step (e.g. when you issue a rendering command that uses the texture, when you swap FBOs, etc.).

There doesn’t have to be a difference between rendering to a texture and rendering to a renderbuffer, but by having the semantic distinction in the API, drivers are able to choose the optimal path for the given hardware.
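For instance, a depth attachment you never sample is just a few calls as a renderbuffer (a sketch with the GL 3.0 core names; `fbo`, `width` and `height` are assumed to exist):

```c
GLuint depthRb;
glGenRenderbuffers(1, &depthRb);
glBindRenderbuffer(GL_RENDERBUFFER, depthRb);
/* We only request 24-bit depth; the driver is free to pick
 * whatever internal layout (compression, tiling) suits the HW. */
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24,
                      width, height);

glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                          GL_RENDERBUFFER, depthRb);
```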

Jan.

Originally posted by Ysaneya:
[b]

  1. What is the point of creating more than one FBO? Even if you want to render to many textures per frame, you can just use the same FBO and bind/unbind the color/depth textures/buffers. It’s also much faster, since on Vista an FBO switch is, as I discovered, extremely slow.
    [/b]
    When you have more than a single RenderTexture, say a dynamic env cube map and some shadow maps (more than one light), you’d better have one FBO for each conceptual thing
    (i.e. one light’s shadow map is one FBO, the common dynamic env cube map is another…).
    Switching attachments in a FBO is VERY slow.
    (I think I measured as much as 20ms)
    [An alternative is to use multiple draw buffers to change the target inside a single FBO; it works for color only, though, since there’s only one attachment point for depth. That’s the reason I prefer the previous approach.]
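The draw-buffers variant could look like this (a sketch; it assumes two color textures are already attached to the bound FBO as attachments 0 and 1):

```c
/* Attach once, then redirect output without touching attachments: */
glDrawBuffer(GL_COLOR_ATTACHMENT0);  /* pass 1 writes to color0 */
/* ... render pass 1 ... */
glDrawBuffer(GL_COLOR_ATTACHMENT1);  /* pass 2 writes to color1 */
/* ... render pass 2 ... */
```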

[b]
2) What is the point of using render buffers in place of textures? Think of using a depth texture instead of a depth buffer everywhere: you get the ability to use that depth texture in other passes for free. Is rendering to a texture slower than rendering to a buffer? (Note: the only exception I can understand is multi-sampling, where you have to use an intermediate buffer.)

Y. [/b]
Well, for the previously mentioned dynamic env cube map, I obviously need a depth buffer but don’t care about its contents, so I create it as a RenderBuffer.
I tend to just use whatever seems correct for my usage: if I need it as a texture, I make it a RenderTexture; otherwise, a RenderBuffer. It might have some effect on how the driver handles it, so it’s better to be accurate.

Originally posted by k_szczech:
1 - even if you prefer to bind textures dynamically, you can still have several FBOs with different sets of binding points:
FBO 1: depth, color0, color1
FBO 2: depth, color
FBO 3: depth
FBO 4: color

… or you can have a single FBO and do the same by attaching/detaching buffers and textures to get the intended combination.
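In code, that single-FBO approach is just a re-attach between passes (a sketch; `fbo`, `texA` and `texB` are placeholder ids assumed to exist):

```c
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, texA, 0);
/* ... render pass A into texA ... */
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, texB, 0);
/* ... render pass B into texB ... */
```

Whether this is faster or slower than binding a second FBO is exactly what the rest of the thread disputes.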

Originally posted by k_szczech:
2 - render buffers may be faster (early Z, memory alignment) and can allow more features (multisampling). In other words - they’re usually more hardware-friendly.
I agree about multi-sampling, but your “performance” explanations don’t make sense to me. You still get early-Z when you render to a depth texture (for example, when you do shadow mapping).

Originally posted by Roderic (Ingenu):
Switching attachments in a FBO is VERY slow.
(I think I measured as much as 20ms)
Not true. In fact, what I’ve heard is that generally, changing an attachment tends to be slightly faster than binding an entirely new FBO.

However, I wouldn’t be surprised if changing four attachments is slower than binding a single alternate FBO.

Neither of these operations is slow on its own. Both are a matter of microseconds, sometimes even nanoseconds. However, pulling a texture into video memory that hasn’t been used before is slow, and this sometimes needs to be done when binding an FBO or changing an attachment. This overhead can pollute timings if you don’t know to look for it.

If you run your program 5 or 6 times in a row and a given step still takes a long time, then it’s probably truly slow. But FBO binding and attachment swapping won’t be remotely slow at that point.

Originally posted by Ysaneya:
2) What is the interest of using render buffers in place of textures ? Think of using a depth texture instead of a depth buffer everywhere: you get the ability to use this depth texture in other passes for free.
Hardware likes to use depth compression because it saves a lot of bandwidth. However, if you want to read such a buffer as a texture, it’ll have to be in a format that the texture unit can read. Generally you don’t want to implement a good chunk of the Z hardware in the texture units too. Instead, the buffer would be decompressed, which is a costly operation, especially if you render, read, render, read, etc. repeatedly. In many cases the driver will just choose to disable depth compression altogether on depth textures. Depth textures may also be more limited in terms of tiling and memory layout because of the texture unit capabilities (I say “may” because I don’t know for sure, but it wouldn’t surprise me).