glFlush in render to texture cubemap pbuffer

Hi,

I’m trying to use a render-to-texture pbuffer to render a cube map texture. The WGL_ARB_render_texture specification says, in the ‘Intended Usage’ section:

  1. Render all the cube map faces to the pbuffer. Call wglSetPbufferAttribARB to set the cube map face before rendering each face. Call glFlush.

But I get a strange result: on an NVIDIA card (GF4600Ti) it is enough to call glFlush once after all faces are rendered, while on ATI (Radeon9700Pro) I must call glFlush after rendering each face, otherwise the faces are not updated.
Do you know which behaviour is right? To me the specification reads as if a single glFlush call should be enough…
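For reference, the two variants being compared can be sketched roughly like this. It is only a sketch: FaceOps, render_cube_faces and the counting stubs are made-up names for illustration, and the callbacks stand in for the real calls (wglSetPbufferAttribARB with WGL_CUBE_MAP_FACE_ARB, the scene drawing, and glFlush), so it compiles without a GL context.

```c
#include <stddef.h>

/* Hypothetical callback table. In a real program these would wrap
   wglSetPbufferAttribARB (WGL_CUBE_MAP_FACE_ARB), the scene drawing
   code, and glFlush. */
typedef struct {
    void (*set_face)(int face, void *user);  /* select cube map face 0..5 */
    void (*draw_face)(int face, void *user); /* render the scene for that face */
    void (*flush)(void *user);               /* e.g. glFlush() */
    void *user;
} FaceOps;

/* Renders all six faces. If flush_each_face is nonzero, flushes after
   every face (the variant needed on the Radeon here); otherwise
   flushes once after the last face (the variant that is enough on
   the GeForce). Returns the number of flushes issued. */
static int render_cube_faces(const FaceOps *ops, int flush_each_face)
{
    int flushes = 0;
    for (int face = 0; face < 6; ++face) {
        ops->set_face(face, ops->user);
        ops->draw_face(face, ops->user);
        if (flush_each_face) {
            ops->flush(ops->user);
            ++flushes;
        }
    }
    if (!flush_each_face) {
        ops->flush(ops->user);
        ++flushes;
    }
    return flushes;
}

/* Counting stubs standing in for the real GL/WGL calls. */
typedef struct { int set_calls, draw_calls, flush_calls; } Counters;

static void count_set(int face, void *u)  { (void)face; ((Counters *)u)->set_calls++; }
static void count_draw(int face, void *u) { (void)face; ((Counters *)u)->draw_calls++; }
static void count_flush(void *u)          { ((Counters *)u)->flush_calls++; }

/* Runs the loop against the stubs and reports how often each was called. */
static Counters run_with_counters(int flush_each_face)
{
    Counters c = {0, 0, 0};
    FaceOps ops = { count_set, count_draw, count_flush, &c };
    render_cube_faces(&ops, flush_each_face);
    return c;
}
```

With the stubs, run_with_counters(1) issues six flushes and run_with_counters(0) issues one; both select and draw all six faces.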

Thank you!

This also happens when I render to the back buffer and then use glCopyTexSubImage to copy the result into a texture (in my no-pbuffer code path).
When this texture is accessed, the NVIDIA drivers seem to wait until the copy call has been processed, but ATI's do not. ATI seems to simply supply the old version if the operation has not been processed by that time. It looks as if the ATI drivers process the geometry/3D pipeline in parallel with the pixel-copy calls, and if one of them is faster, it simply does not wait for the other until you call glFlush…

As I said, “it looks like”. Correct me if I am wrong, but with this assumption in my code, every part of my code works fine on ATI…

It seems NV pipelines the copy into textures, i.e. its command is placed in line with the others and therefore won’t be executed until everything before it is done, while ATI seems to do it immediately. I can’t see any reason why it can’t be pipelined, and it doesn’t even have to block either.
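To make that guess concrete, here is a toy model in plain C. It is not how either driver actually works, just the hypothesis above written down: ToyGpu, toy_draw, toy_copy, toy_flush and toy_scenario are invented names, draws are queued until a flush, and the copy either runs in order behind the queued draws or immediately against whatever is in the framebuffer right now.

```c
#include <stddef.h>

enum { MAX_PENDING = 16 };

/* Toy state: integers stand in for image contents. */
typedef struct {
    int pending[MAX_PENDING]; /* queued draws that have not executed yet */
    size_t count;
    int framebuffer;          /* current back buffer contents */
    int texture;              /* current destination texture contents */
} ToyGpu;

/* Queue a draw; nothing reaches the framebuffer until a flush. */
static void toy_draw(ToyGpu *g, int value)
{
    if (g->count < MAX_PENDING)
        g->pending[g->count++] = value;
}

/* Execute every queued draw in submission order. */
static void toy_flush(ToyGpu *g)
{
    for (size_t i = 0; i < g->count; ++i)
        g->framebuffer = g->pending[i];
    g->count = 0;
}

/* Copy framebuffer -> texture. in_order != 0 models a copy that is
   pipelined behind earlier draws; in_order == 0 models a copy that
   runs at once and may see a stale framebuffer. */
static void toy_copy(ToyGpu *g, int in_order)
{
    if (in_order)
        toy_flush(g);
    g->texture = g->framebuffer;
}

/* Old frame is 1, new frame is 2. Returns what ends up in the texture. */
static int toy_scenario(int in_order_copy, int flush_before_copy)
{
    ToyGpu g = { {0}, 0, 1, 1 };
    toy_draw(&g, 2);
    if (flush_before_copy)
        toy_flush(&g);
    toy_copy(&g, in_order_copy);
    return g.texture;
}
```

In this model only the combination of an immediate copy and no flush leaves the old frame (1) in the texture; the other three combinations give the new frame (2), which matches what was reported for the two cards.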

Does taking the glFlush out provide any performance increase on an NV card?

Theoretically there should be little impact from removing the flush as it seems that NV are doing one anyway (they are just being smart enough to realise that if you want to copy the data, then it’d better be drawn first).

I wonder if either one of them is more “correct” than the other? Obviously the NV result is what you’d expect to happen, but the ATI result may be just another way of interpreting the Spec. I’d expect the relevant part of the spec relates to the CopyPixels functionality (though if it is happening in both Copy Pixels and Render to texture then that would be two parts of the spec wouldn’t it?).

I’m glad this was brought up (and I’ll be interested to see what ATI have to say) as I develop on NV and very rarely get to play with ATI to test my code (and I’ve just written a large chunk of Copy Tex.

Originally posted by Nutty:
Does taking the glFlush out provide any performance increase on an NV card?

In practice too (as rgpc assumed in theory), on NVIDIA the speed is the same in both cases: one glFlush after rendering all faces, or six glFlush calls, one after each face.