97.44 multisample framebuffer_blit still inverted



CatAtWork
12-09-2006, 08:41 AM
Is this bug being worked on? It occurs on both an 8800GTX and a 1500M.

JeffJ
12-13-2006, 11:10 AM
Yes. It will be fixed in a future driver release. Thank you for taking the time to produce a simple repro app; it helped us greatly.

This issue is caused by a problem that could sometimes happen when doing a downsample blit directly into the window. To work around the issue on the current drivers, you could modify your application to perform the downsample blit into a second single-sample FBO, and then do a 1:1 blit from the single-sample FBO to the window.

CatAtWork
12-13-2006, 11:44 AM
Thanks for acknowledging this. Is a 1:1 inverted blit slower than a regular 1:1 blit?


Here's what I've been using as a workaround, which looks exactly like what you suggested:

GLuint drawFramebuffer = 0;
glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, drawFramebuffer );
CHECKGL;

glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, rt.fbo->_handle );
CHECKGL;

GL_CHECK_FRAMEBUFFER_STATUS( GL_FRAMEBUFFER_EXT );
CHECKGL;
GL_CHECK_FRAMEBUFFER_STATUS( GL_DRAW_FRAMEBUFFER_EXT );
CHECKGL;
GL_CHECK_FRAMEBUFFER_STATUS( GL_READ_FRAMEBUFFER_EXT );
CHECKGL;


const int srcWidth = appWindow.width;
const int srcHeight = appWindow.height;


const bool flip = GetBool("r_postProcessFlipFBBlit") ? true : false;
const float scale = GetFloat("r_postProcessScaleFBBlit");

#pragma warning( disable : 4244 )
const int dstWidth = (float)appWindow.width * scale;
const int dstHeight = (float)appWindow.height * scale;
#pragma warning( default : 4244 )

const GLenum filtering = GetBool("r_postProcessFilterFBBlit") ? GL_LINEAR : GL_NEAREST;


{
    glBlitFramebufferEXT( 0, 0, srcWidth, srcHeight,
                          0, 0, dstWidth, dstHeight,
                          GL_COLOR_BUFFER_BIT, filtering );
}

if( flip ){
    glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, 0 );
    CHECKGL;
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, g_rb1.fbo->_handle );
    CHECKGL;

    glBlitFramebufferEXT( 0, srcHeight, srcWidth, 0, // reverse Y
                          0, 0, dstWidth, dstHeight,
                          GL_COLOR_BUFFER_BIT, filtering );

    glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, g_rb1.fbo->_handle );
    CHECKGL;
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, 0 );
    CHECKGL;

    glBlitFramebufferEXT( 0, 0, srcWidth, srcHeight,
                          0, 0, dstWidth, dstHeight,
                          GL_COLOR_BUFFER_BIT, filtering );

    CHECKGL;
}
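
A side note on the destination-size computation in the snippet above: the pragma pair is there only to silence the float-to-int conversion warning (C4244). Spelling out the conversion avoids the need for it. This is a minimal sketch that assumes truncation toward zero is the intended behavior, as in the original lines; scaledExtent is a hypothetical helper name, not something from the repro app.

```cpp
#include <cassert>

// Hypothetical helper: scales a window extent by a float factor and
// truncates toward zero, matching the implicit conversion in the
// original snippet but without triggering warning C4244.
int scaledExtent( int extent, float scale )
{
    assert( extent >= 0 && scale >= 0.0f );
    return static_cast<int>( static_cast<float>( extent ) * scale );
}
```

With this, the two lines between the pragmas could read `const int dstWidth = scaledExtent( appWindow.width, scale );` and likewise for the height.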

JeffJ
12-13-2006, 12:24 PM
Originally posted by CatAtWork:
Is a 1:1 inverted blit slower than a regular 1:1 blit?

As far as I know, a 1:1 inverted blit should always run at the same speed as a 1:1 non-inverted blit.

Originally posted by CatAtWork:
Here's what I've been using as a workaround, which looks exactly like what you suggested

Almost. In spirit these have the same semantics. In practice it looks like your snippet does one more blit than I had in mind.

glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, rt.fbo->_handle );

if( flip ) {
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, g_rb1.fbo->_handle );
} else {
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, 0 );
}

glBlitFramebufferEXT( 0, 0, srcWidth, srcHeight,
                      0, 0, dstWidth, dstHeight,
                      GL_COLOR_BUFFER_BIT, filtering );

if( flip ){
    glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, g_rb1.fbo->_handle );
    glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, 0 );
    glBlitFramebufferEXT( 0, 0, srcWidth, srcHeight,
                          0, 0, dstWidth, dstHeight,
                          GL_COLOR_BUFFER_BIT, filtering );
}

CatAtWork
12-13-2006, 01:08 PM
Oh, I see, the inversion only happens when blitting from a multisample FBO to the window, not from multisample to single-sample.
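
That observation can be written down as a small decision helper. The following is only a sketch of the logic discussed in this thread and makes no GL calls; Target, BlitStep and planPresent are hypothetical names, and the driverInvertsWindowResolve flag is an assumption standing in for "running on an affected driver".

```cpp
#include <cassert>
#include <vector>

// Hypothetical labels for the framebuffers discussed in the thread.
enum class Target { SceneFbo, ResolveFbo, Window };

struct BlitStep {
    Target read;  // framebuffer bound as GL_READ_FRAMEBUFFER_EXT
    Target draw;  // framebuffer bound as GL_DRAW_FRAMEBUFFER_EXT
};

// Returns the sequence of blits needed to present the scene FBO.
// The detour through a single-sample FBO is taken only when the source
// is multisampled and the driver is assumed to invert a direct
// multisample-to-window blit.
std::vector<BlitStep> planPresent( int srcSamples, bool driverInvertsWindowResolve )
{
    assert( srcSamples >= 0 );
    const bool multisampled = srcSamples > 1;
    if( multisampled && driverInvertsWindowResolve ) {
        return { { Target::SceneFbo, Target::ResolveFbo },   // downsample
                 { Target::ResolveFbo, Target::Window } };   // 1:1 copy
    }
    return { { Target::SceneFbo, Target::Window } };
}
```

Each returned step maps onto one pair of glBindFramebufferEXT calls followed by one glBlitFramebufferEXT call, so the workaround costs at most one blit more than the direct path.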

I've added the two-blit path, but it's significantly slower! Looking into it now.

CatAtWork
12-13-2006, 01:55 PM
I'm not sure why the 2-blit approach is slower than the 3-blit one, but here's another repro app.
http://www.effloresce.com/cat/opengl.org/fbo_blit_perf-20061213.zip
It's 12 MB because I didn't have much time to prune it.

I'd suggest maximizing the window to something large, ideally 1600 to 1900 pixels wide.

My only thought is that, after the window is maximized, the allocation order of the framebuffers is not optimal. They're created when r_postProcessEnable 1 is executed, not when the GL context is created.


r_postProcessMultisamples X (I used 8 and 16)
r_postProcessFlipFBBlit 1 (enables a flip in the 3-blit path)
r_postProcessEnable 1

r_postProcessAllowFBBlit 2 for the 3-blit path that I posted

r_postProcessAllowFBBlit 3 for your path, Jeff

r_timeGL 1 for EXT_timer_query-based FPS

ocean_useShader 1 to perform some heavy per-pixel work

image_anisotropic 8 or 16 to get rid of the texture2DProj artifacts at a distance