faster

Hello,
Which is faster: copying the data from the framebuffer to the accumulation buffer and then back to the framebuffer, or using glReadPixels and then glCopyPixels to copy them back to the framebuffer?
Thanks in advance,
Hylke

“give a man a fish and you feed him for a day, teach him how to fish and you feed him for a lifetime”

Implement both and see which one is faster.

Cheers,
Peter

Nothing beats rendering to texture, so I suggest you consider using the Framebuffer Object extension or simply the glCopyTexSubImage function (FBOs will be faster).

i was told that on NVIDIA gpus, rendering to the framebuffer then copying into a texture was (curiously) the fastest way of achieving render to texture

the man who told me that based his statements on directx experiments but according to him, it should also be the case with opengl.

so, can someone please enlighten me on this? on nvidia hardware, will FBO be faster than glCopyTexSubImage? does it also apply to ati?

many thx,
g.

In my first experimental FBO application, I also observed that CopyTexSubImage was faster than rendering to an FBO. This was with the first FBO-capable driver (on Linux, GFFX).

But with a later app, I could not reproduce the effect. I guess it heavily depends on how you use the FBO, how much you render to it and how often you switch FBO. My test application had only 4 spheres with cubemapping in the whole scene, but it did 24 passes to FBO and one onscreen pass per frame, so it was a bit of an extreme test case.

I guess the best way to find out is to implement both and try them out in your scene. I’m pretty sure that, depending on a lot of different factors, either one could be faster.
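A minimal sketch of how such a comparison could be timed, assuming each approach is wrapped in a function that renders or copies one frame. The names `pass_fn`, `average_seconds` and `count_call` are made up for this illustration and are not part of OpenGL. For GL work the wrapped function should end with glFinish(), otherwise the timer only measures how long it takes to queue the commands.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one call performs one complete pass
 * (for example "render to FBO" or "render, then glCopyTexSubImage2D").
 * In a real GL test the callback should finish with glFinish(). */
typedef void (*pass_fn)(void *ctx);

/* Returns the average wall-clock seconds per call over `iterations` calls. */
double average_seconds(pass_fn pass, void *ctx, int iterations)
{
    struct timespec start, end;

    assert(pass != NULL && iterations > 0);

    timespec_get(&start, TIME_UTC);
    for (int i = 0; i < iterations; ++i)
        pass(ctx);
    timespec_get(&end, TIME_UTC);

    double elapsed = (double)(end.tv_sec - start.tv_sec)
                   + (double)(end.tv_nsec - start.tv_nsec) * 1e-9;
    return elapsed / iterations;
}

/* Stand-in pass for a dry run without a GL context: it only counts calls. */
static void count_call(void *ctx)
{
    ++*(int *)ctx;
}
```

Calling `average_seconds` once per approach with the same scene and the same iteration count gives two numbers that can be compared directly. A few hundred iterations should smooth out per-frame noise.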

Ok, thanks for the replies :)
But how do I render those pixels back to the framebuffer? Just using glDrawPixels?
Hylke

I’ve found that copying to a texture is still faster than rendering directly to a texture. From what I can tell, early Z (aka hierarchical Z… apparently they’re the same thing) is not enabled for render to texture (rendering to a texture rectangle doesn’t seem to have this limitation). My guess as to why is that the video card stores texture data swizzled (a swizzled array layout, not a color swizzle) in a more cache-friendly layout, so that a linear memory fetch can load a 2D region of a texture; that said, there is probably dedicated hardware for rendering to this swizzled format, and using it has to disable the hierarchical Z stage. Note that this performance information was gathered from NVIDIA cards and may not pertain to ATI cards, or ATI may have different limitations entirely.
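To illustrate what a swizzled layout means here: the actual layouts used by the hardware are not public, so the sketch below uses Morton (Z-order) indexing purely as an assumed stand-in. It interleaves the bits of x and y so that texels that are close in 2D end up close in memory. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Spreads the low 16 bits of v so that bit i moves to bit 2*i. */
static uint32_t spread_bits(uint32_t v)
{
    v &= 0x0000ffffu;
    v = (v ^ (v << 8)) & 0x00ff00ffu;
    v = (v ^ (v << 4)) & 0x0f0f0f0fu;
    v = (v ^ (v << 2)) & 0x33333333u;
    v = (v ^ (v << 1)) & 0x55555555u;
    return v;
}

/* Morton (Z-order) offset of texel (x, y): x bits go to the even
 * positions, y bits to the odd positions. Illustration only, this is
 * NOT a documented hardware layout. */
uint32_t morton_offset(uint32_t x, uint32_t y)
{
    assert(x <= 0xffffu && y <= 0xffffu);
    return spread_bits(x) | (spread_bits(y) << 1);
}

/* Plain row-major offset, for comparison. */
uint32_t linear_offset(uint32_t x, uint32_t y, uint32_t width)
{
    return y * width + x;
}
```

In a 1024-wide row-major texture the texels (0,0) and (0,1) are 1024 elements apart, while in the Morton layout the whole 2x2 block at the origin occupies offsets 0 to 3, which is why one linear fetch can cover a small 2D region.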

Kevin

Another point to consider:
Antialiasing may not be available for rendering to the texture.

Also, how do you render to the framebuffer as well as a texture efficiently when not using copytex* (say you want to output to the framebuffer as per usual, but you also want to render to a texture)?

I should probably get up to speed with FBOs (I’ve only skimmed over the spec) before asking such a question.

Needing the same scene both on- and offscreen is one of the weaknesses of FBO.

For the color buffer it’s trivial. Just render the scene to a texture via FBO, then render a textured fullscreen quad to the screen. Unfortunately, for the more common case, the depth buffer, I’ve not found a way to do this, except of course rendering the depth pass twice.

But if you really only need to have one image both in a texture and on screen, and you don’t need such things as a float color buffer, it’s probably faster to just use CopyTexSubImage…

Hell, I can’t even get > 8-bit precision reading (NOT COMPARING) depth component textures :/. (I have a 9600, so what are my options, 16/32bit fp textures?)
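To put the 8-bit figure in perspective, here is a small arithmetic sketch. It assumes depth is a value in [0, 1] stored as an unsigned normalized integer, and it says nothing about why a particular driver would return only 8 bits. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest depth difference representable with `bits` bits of
 * unsigned normalized storage. */
double depth_step(int bits)
{
    assert(bits >= 1 && bits <= 32);
    uint64_t levels = ((uint64_t)1 << bits) - 1;
    return 1.0 / (double)levels;
}

/* Rounds a depth in [0, 1] to the nearest value representable
 * with `bits` bits. */
double quantize_depth(double depth, int bits)
{
    assert(bits >= 1 && bits <= 32);
    assert(depth >= 0.0 && depth <= 1.0);
    uint64_t levels = ((uint64_t)1 << bits) - 1;
    uint64_t q = (uint64_t)(depth * (double)levels + 0.5);
    return (double)q / (double)levels;
}
```

With 8 bits there are only 255 steps, so depths of 0.500 and 0.501 both quantize to 128/255 and become indistinguishable. With 24 bits (as requested by GL_DEPTH_COMPONENT24) there are 16777215 steps and the two values stay distinct.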

quorpse, show how you create, initialize and use them.

Overmind, can you elaborate on why this is a weakness of FBO? Unless I’m misunderstanding something, this looks false to me.

Creation:

glGenTextures(1,&texid);
glBindTexture(GL_TEXTURE_2D, texid);
glTexImage2D(GL_TEXTURE_2D, 0,GL_DEPTH_COMPONENT24, texwidth, texheight, 0, GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, 0);
glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_S,GL_CLAMP);
glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_WRAP_T,GL_CLAMP);
glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MAG_FILTER,GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D,GL_DEPTH_TEXTURE_MODE_ARB,GL_LUMINANCE);
glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_COMPARE_MODE_ARB,GL_NONE);

Usage:

Copying it with copytexsubimage.
Usage in the fragment shader is with a sampler2D:
float depth = texture2D(depthmap,gl_TexCoord[0].st).r;

Well, what do you propose I should do if I need the same depth buffer on screen and in a renderbuffer? It’s simply not possible with FBO.

There are only two workarounds I found so far:

  1. Obviously, rendering the scene twice works, but I wish to avoid that because my scene is very vertex heavy.
  2. I do everything in FBOs, even the final pass, and transfer the result texture to the screen with a fullscreen quad, but then I lose multisampling.

Neither workaround really solves the problem; they just eliminate the need by doing additional work.

What I would want to do is simply bind the depth renderbuffer to the default framebuffer (or the other way round, doesn’t matter for me because I don’t need the depth in a texture).

It’s not something I can’t live with. Solution 2 in particular, combined with the (hopefully) upcoming FBO multisample extension, seems like a good compromise, since a single non-shader fullscreen quad isn’t that much work. But it’s a bit inconvenient.

Overmind, I’m facing a similar issue, but I actually need the depth in a texture too. What I tried was rendering depth as colour to a 32bit fp texture and then rendering the fullscreen quad with that texture, writing gl_FragDepth to match the depth in the texture (thus avoiding the extra geometry pass). This unfortunately has severe precision/consistency implications, and I found the resulting depth precision useless.

You’re right, Overmind. I didn’t understand your previous post that way. There’s absolutely no way to use the same buffer both on screen and in an FBO without losing either performance or quality. At least that was what I thought, and you confirmed it.

Fortunately this is not always needed. I mean, I mainly need different buffers (e.g. for shadow mapping, where the FBO is rendered from the light’s POV, or for cubemaps…).

But obviously (?) there’s a need for some way of exchanging buffers between the on-screen framebuffer and FBOs. Some of you might correct me on this, though, since there may be other things that prevent this from being done.

qorpse, I don’t see anything wrong. It might come from the driver, I guess.

So, anyone have any ideas on what I can try to get useful depth reads in a fragment shader? This is really killing motivation.

qorpse… your question seems a bit unrelated to this thread, so you might want to post a new thread or point me to another thread you have opened. I have an answer to your question and will be glad to provide you with an answer on an appropriate thread (would be a bit more useful to others that are having a similar problem). =)

Kevin B

I would be eternally thankful for any help. I opened a thread a while ago.

Topic is “Useful depth texture reads.” in the GLSL forum.

Sorry for my latching on to this thread.