Closed-loop FBO operation, practical application

Hello,

I was wondering whether there’s a definitive answer on closed-loop FBO operations.

Some people say that closed-loop FBO operations don’t work at all, while others say that it is possible as long as the read/write operations are done on different textures (texture attachments).

I tried experimenting with deferred shading using a single FBO. Rendering the pre-shading maps worked fine, but the shading pass produced strange (I would say random) results. The shading pass reads the textures at attachments 0-3 and outputs into the texture at attachment 4, with no extra FBO operations other than the per-step glDrawBuffers call that enables the specific texture outputs. The shading itself is currently disabled for testing purposes; the pass simply outputs unchanged values from the input color map.

Is there any other step that needs to be done in order to allow that (assuming the answer to the first question is “yes”)?

I’ve read about the texture barrier, but there are very few examples of its actual use. Also, I believe it is not available in OGL 3.2 (I can switch to 4.5, but so far I have had no reason to). I would like to avoid using two FBOs with the same textures for ping-ponging, although it might be the only solution with decent performance.

The result: (see attachment 1592)

What is a “closed-loop FBO operation”? Google says nothing about this term, so it’s unclear who the hypothetical “some people” and “others” are who talk about them.

If you’re talking about FBO feedback loops (i.e., rendering to an attached image while reading from it), that is well-covered. Under standard GL 3.x rules, you cannot read from any image that is attached to the framebuffer, period. Under NV/ARB_texture_barrier/OpenGL 4.5, things are more relaxed.

Of course, NV_texture_barrier is widely implemented, so you probably already have this capability.
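To make the "you probably already have this capability" point concrete, here is a minimal sketch of checking which barrier entry point a context offers. It is plain C with no GL calls: the version numbers and extension strings are passed in by the caller, who would normally obtain them from glGetIntegerv(GL_MAJOR_VERSION / GL_MINOR_VERSION) and glGetStringi(GL_EXTENSIONS, i). The names `BarrierSupport` and `find_texture_barrier` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
typedef enum {
    BARRIER_NONE = 0,    /* no texture barrier available */
    BARRIER_CORE_OR_ARB, /* use glTextureBarrier() (GL 4.5 or ARB_texture_barrier) */
    BARRIER_NV           /* use glTextureBarrierNV() (NV_texture_barrier) */
} BarrierSupport;

/* Decide which barrier function to use from the context version and the
   list of extension strings reported by the driver. */
BarrierSupport find_texture_barrier(int major, int minor,
                                    const char *const *extensions, size_t count)
{
    assert(extensions != NULL || count == 0);

    /* glTextureBarrier is core functionality from OpenGL 4.5 onwards. */
    if (major > 4 || (major == 4 && minor >= 5))
        return BARRIER_CORE_OR_ARB;

    BarrierSupport found = BARRIER_NONE;
    for (size_t i = 0; i < count; ++i) {
        /* The ARB extension exposes the same unsuffixed entry point as core. */
        if (strcmp(extensions[i], "GL_ARB_texture_barrier") == 0)
            return BARRIER_CORE_OR_ARB;
        /* Remember the NV variant, but keep looking for the ARB one. */
        if (strcmp(extensions[i], "GL_NV_texture_barrier") == 0)
            found = BARRIER_NV;
    }
    return found;
}
```

On a 3.2 context this only reports a barrier if the driver lists one of the two extensions, which matches the situation described above.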

Yeah that’s exactly it. Closed-loop = feedback operation.

The page that you provided mentions “Similarly, if you wrote to an image, then want to read the data you wrote, you can issue the barrier instead of having to detach the image. You can use Write Masks or Draw Buffer state to prevent writing while you are reading.”. This complements your comment about OpenGL 4.5 being more relaxed.

I am actually calling glDrawBuffers(n, bufs) between each operation, enabling different texture attachments for each step of my render. I was under the impression that this, by itself, would block writing to the texture I’m reading from (not that I’m doing that). However, since the output is random even with OpenGL 4.5, I was wondering whether something else is needed.

I need to do a lot more research on NV_texture_barrier before I figure out how to use it, so I would like to avoid it for now.

Thanks for the link.

The draw buffer state and write masks prevent the texture from being modified, but the pixels are still deemed to have been modified so far as the memory model is concerned. The implementation is free to discard any cached data for those pixels in all attached textures regardless of draw buffer state and write masks, so reading may yield garbage.

Well, there’s no way to know what’s really going on from just a description, but as stated in the wiki page:

This remains unchanged even with 4.5:

Visibility of previously written data cannot be achieved unless you actually detach the images/change which FBO is currently being rendered to, or issue a texture barrier.
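For the single-FBO deferred setup described at the top of the thread, the ordering implied by that quote can be sketched as follows. This is only an illustration under assumptions: the GL calls are routed through a hypothetical function table (`FeedbackApi`) so the sequence can be dry-run without a context. In a real program `draw_buffers` and `texture_barrier` would be glDrawBuffers and glTextureBarrier from your loader, the two draw callbacks stand in for your own code, and the `trace_*` helpers exist purely for the dry run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table of entry points (not a real GL structure). */
typedef struct {
    void (*draw_buffers)(int count, const unsigned int *bufs); /* glDrawBuffers */
    void (*texture_barrier)(void);                             /* glTextureBarrier */
    void (*draw_scene_geometry)(void);  /* app code: fills attachments 0-3 */
    void (*draw_fullscreen_quad)(void); /* app code: one non-overlapping pass */
} FeedbackApi;

enum { COLOR_ATTACHMENT0 = 0x8CE0 }; /* numeric value of GL_COLOR_ATTACHMENT0 */

void render_deferred_single_fbo(const FeedbackApi *gl)
{
    assert(gl != NULL);
    const unsigned int gbuffer_targets[4] = {
        COLOR_ATTACHMENT0 + 0, COLOR_ATTACHMENT0 + 1,
        COLOR_ATTACHMENT0 + 2, COLOR_ATTACHMENT0 + 3
    };
    const unsigned int shading_target[1] = { COLOR_ATTACHMENT0 + 4 };

    /* Pass 1: write the pre-shading maps into attachments 0-3. */
    gl->draw_buffers(4, gbuffer_targets);
    gl->draw_scene_geometry();

    /* Make the texels written in pass 1 visible to texture fetches.
       Switching draw buffers alone does not do this. */
    gl->texture_barrier();

    /* Pass 2: sample attachments 0-3, write only attachment 4. */
    gl->draw_buffers(1, shading_target);
    gl->draw_fullscreen_quad();
}

/* Dry-run backend: records the call order instead of talking to GL. */
enum { EV_DRAW_BUFFERS, EV_BARRIER, EV_GEOMETRY, EV_QUAD };
static int trace[16];
static size_t trace_len;
static void record(int ev) { if (trace_len < 16) trace[trace_len++] = ev; }
static void trace_draw_buffers(int n, const unsigned int *b) { (void)n; (void)b; record(EV_DRAW_BUFFERS); }
static void trace_barrier(void)  { record(EV_BARRIER); }
static void trace_geometry(void) { record(EV_GEOMETRY); }
static void trace_quad(void)     { record(EV_QUAD); }
```

The only difference from the setup in the first post is the barrier between the two passes; everything else (one FBO, glDrawBuffers per step) stays the same.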

Look to https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_texture_barrier.txt .

Another thing that can give similar functionality (but not using FBOs) is to use and abuse https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_shader_image_load_store.txt

Can glTextureBarrier() be used for the entire image? The wiki page only provides an example of piece-wise rendering. Should it be called before each draw call, or just once before the step (that is, before the first draw call after the shader program has been changed)? Edit: never mind, it is mentioned further down.

How does glTextureBarrier() do when it comes to performance? Is it an improvement compared to FBO binding/unbinding?

Thanks.

If it wasn’t cheaper than changing the FBO, why would they bother adding it?

I wasn’t even sure if I needed it. Now that I know I do, I wonder about its performance.

glTextureBarrier is essentially a cache flush: all render caches are flushed to video memory and the texture caches are invalidated.

Also note that when relying on glTextureBarrier(), any given draw call must NOT touch the same pixel more than once, and if two draw calls overlap in screen space, then a glTextureBarrier() needs to be issued between them.
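As a rough illustration of that overlap rule, here is a small sketch that decides where barriers would have to go for a sequence of draws, using conservative screen-space bounding rectangles. The `Rect` type and `plan_texture_barriers` are invented for this example, and it assumes each individual draw does not overlap itself, which a bounding rectangle cannot verify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open pixel bounds [x0, x1) x [y0, y1) of one draw call. */
typedef struct { int x0, y0, x1, y1; } Rect;

static bool rects_overlap(Rect a, Rect b)
{
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

/* Sets barrier_before[i] to true when draw i overlaps any draw issued since
   the last barrier, meaning a barrier must be issued before draw i.
   Returns the number of barriers needed. */
size_t plan_texture_barriers(const Rect *draws, size_t count, bool *barrier_before)
{
    assert((draws != NULL && barrier_before != NULL) || count == 0);

    size_t batch_start = 0; /* first draw since the last barrier */
    size_t barriers = 0;
    for (size_t i = 0; i < count; ++i) {
        barrier_before[i] = false;
        for (size_t j = batch_start; j < i; ++j) {
            if (rects_overlap(draws[i], draws[j])) {
                barrier_before[i] = true;
                batch_start = i; /* draw i starts a new barrier-free batch */
                ++barriers;
                break;
            }
        }
    }
    return barriers;
}
```

Draws that never overlap anything in their batch need no barrier at all, so a full-screen pass made of a single quad costs one barrier at most.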

If one wants a fragment shader to read from the framebuffer surface at the location of its own invocation, I (personally) prefer to use GL_ARB_shader_image_load_store together with GL_ARB_fragment_shader_interlock, but that can have negative consequences as well: on some platforms, lossless color buffer compression is disabled on a surface if it is accessed through GL_ARB_shader_image_load_store, and the interlock forces ordering in screen space, which can also be bad for performance.

There is also, for GLES, the extension GL_EXT_shader_framebuffer_fetch. If you are using Mesa with Intel hardware, you can enable this in GL (with a different extension name) if you are willing to hack the driver.

So that basically means that I can’t use any of the reduced draw call techniques with a single FBO?

If by “reduced draw call techniques”, you’re talking about things like instancing, AZDO and the like, sure you can. So long as all of the stuff between barriers never overlaps anything else in that series of draws.

Read/modify/write operations are not cheap. They’re not things you should do willy-nilly, and they’re usually not something you do . Most such techniques involve full-screen passes, where AZDO techniques are essentially irrelevant.

Also, ARB_fragment_shader_interlock exists for dealing with similar circumstances. But only some hardware supports it.

Thanks for the heads up. I will keep that in mind.