I am trying to achieve maximum AA quality using off-screen rendering, with framebuffer objects.
I also use Brian Paul’s Tile Rendering Library, since I want unlimited rendering sizes.
The code below uses three framebuffer objects, each with one renderbuffer: the first is large and multisampled, the second is large but not multisampled, and the third is small. All three use the GL_ALPHA8 format.
I draw to the first one, blit to the second (which has the same size), and then downsample from the second to the third with glBlitFramebufferEXT.
The maxaa and dsamp (downsampling) parameters control the quality of this operation. Maxaa is set to 8, while tile size varies between 256 and 8192 and dsamp between 1 and 8.
With dsamp = 4 I should get an overall AA = 8 * 4 * 4 = 128, or with dsamp = 2 I should get only AA = 8 * 2 * 2 = 32 (I’m not very sure that I’m getting this quality).
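To make the arithmetic I am assuming explicit, here is a minimal sketch. The helper name effective_aa is hypothetical, and the result is only an upper bound: it assumes the GL_LINEAR downsampling blit really averages all dsamp x dsamp source pixels into each output pixel, which I have not verified.

```c
#include <assert.h>

/* Upper bound on samples per output pixel: the MSAA sample count times the
   dsamp x dsamp block of large-buffer pixels that maps onto one output pixel.
   Assumes the downsampling blit averages the whole block (unverified). */
static unsigned effective_aa(unsigned maxaa, unsigned dsamp)
{
    assert(maxaa >= 1 && dsamp >= 1);
    return maxaa * dsamp * dsamp;
}
```

For example, effective_aa( 8, 4 ) gives 128 and effective_aa( 8, 2 ) gives 32, matching the numbers above.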
I discovered that only combinations such as tsize = 1024 with dsamp = 2, tsize = 512 with dsamp = 4, or tsize = 256 with dsamp = 8 work.
A larger tsize = 2048 with dsamp = 1 also works, but then there is no downsampling. Anything higher fails, although in my previous tests with a single multisampled buffer I used a tsize as large as 8192.
To ensure correctness of the output with various tsize and dsamp values I modified the Tile Rendering Library this way:
In trBeginTile I multiplied the glViewport and glOrtho parameters by dsamp. I also scaled my geometry by dsamp.
My system is i7-920 6GB, 4850 1GB, Vista64 HP, drivers 10.2. Please try to explain to me why my program fails with high tsize. Thank you.
trc1 = trNew() ; // tsize means tile size, trc1 is the TR context
trImageSize( trc1, width2, height2 ) ; // destination image sizes
trTileSize( trc1, tsize, tsize, 0 ) ;
trSetup( trc1 ) ;
glDisable( GL_ALPHA_TEST ) ; // geometry is 2D only, lots of quads or quad strips
glDisable( GL_DEPTH_TEST ) ;
glDisable( GL_STENCIL_TEST ) ;
glPolygonMode( GL_FRONT, GL_FILL ) ;
glEnableClientState( GL_VERTEX_ARRAY ) ;
glPixelStorei( GL_PACK_ALIGNMENT, 1 ) ;
// Create three FBOs, one large and multisampled, one large and one small
glGenFramebuffersEXT( 3, fbo ) ;
glGenRenderbuffersEXT( 3, rendbuf ) ;
glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, fbo[0] ) ;
glBindRenderbufferEXT( GL_RENDERBUFFER_EXT, rendbuf[0] ) ;
glRenderbufferStorageMultisampleEXT( GL_RENDERBUFFER_EXT, maxaa, GL_ALPHA8, tsize * dsamp, tsize * dsamp ) ;
glFramebufferRenderbufferEXT( GL_DRAW_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_RENDERBUFFER_EXT, rendbuf[0] ) ;
glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, fbo[1] ) ;
glBindRenderbufferEXT( GL_RENDERBUFFER_EXT, rendbuf[1] ) ;
glRenderbufferStorageEXT( GL_RENDERBUFFER_EXT, GL_ALPHA8, tsize * dsamp, tsize * dsamp ) ;
glFramebufferRenderbufferEXT( GL_READ_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_RENDERBUFFER_EXT, rendbuf[1] ) ;
glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, fbo[2] ) ;
glBindRenderbufferEXT( GL_RENDERBUFFER_EXT, rendbuf[2] ) ;
glRenderbufferStorageEXT( GL_RENDERBUFFER_EXT, GL_ALPHA8, tsize, tsize ) ;
glFramebufferRenderbufferEXT( GL_READ_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_RENDERBUFFER_EXT, rendbuf[2] ) ;
// glReadPixels can’t read directly from a multisampled buffer, and a multisample resolve blit needs source & destination of the same size
cdmap1c = (uchar *)malloc( width2 * height2 ) ; // allocate destination buffer, grayscale 8bpp image
trImageBuffer( trc1, GL_ALPHA, GL_UNSIGNED_BYTE, (void *)cdmap1c ) ;
trOrtho( trc1, 0, (double)width2, 0, (double)height2, -1.0, 1.0 ) ;
glTranslatef( 0.375, 0.375, 0.0 ) ;
glColor4ub( 0, 0, 0, 0 ) ;
// compute geometry’s 2D vertices
moretiles = 1 ; // start Tile Rendering loop
while( moretiles ){
trBeginTile( trc1 ) ;// setup
glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, fbo[0] ) ; // fbo[0] is draw target
// draw many quads or quad strips
glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, fbo[0] ) ; // fbo[0] is now read target
glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, fbo[1] ) ; // and fbo[1] is draw target
glBlitFramebufferEXT( 0, 0, tsize * dsamp, tsize * dsamp, 0, 0, tsize * dsamp, tsize * dsamp, GL_COLOR_BUFFER_BIT, GL_LINEAR ) ;
glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, fbo[1] ) ; // fbo[1] is now read target
glBindFramebufferEXT( GL_DRAW_FRAMEBUFFER_EXT, fbo[2] ) ; // fbo[2] is draw target
glBlitFramebufferEXT( 0, 0, tsize * dsamp, tsize * dsamp, 0, 0, tsize, tsize, GL_COLOR_BUFFER_BIT, GL_LINEAR ) ;
glBindFramebufferEXT( GL_READ_FRAMEBUFFER_EXT, fbo[2] ) ; // after blitting, make fbo[2] read target
moretiles = trEndTile( trc1 ) ; // reading pixels
}
tifgrywr( pathout, cdmap1c, width2, height2, (uint)outres ) ; // saving to a grayscale TIFF file
// freeing various resources
glDeleteRenderbuffersEXT( 3, rendbuf ) ;
glDeleteFramebuffersEXT( 3, fbo ) ;
trDelete( trc1 ) ;
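The code above does not check framebuffer completeness, which might show where the large sizes fail. A minimal sketch of a status decoder follows; the helper names fbo_status_name and fbo_report are hypothetical, and the #define fallbacks only mirror the enum values from glext.h so that the snippet compiles without the GL headers.

```c
#include <assert.h>
#include <stdio.h>

/* Fallback definitions mirroring glext.h (EXT_framebuffer_object and
   EXT_framebuffer_multisample), used only when the GL headers are absent. */
#ifndef GL_FRAMEBUFFER_COMPLETE_EXT
#define GL_FRAMEBUFFER_COMPLETE_EXT                      0x8CD5
#define GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT_EXT         0x8CD6
#define GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT 0x8CD7
#define GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT         0x8CD9
#define GL_FRAMEBUFFER_INCOMPLETE_FORMATS_EXT            0x8CDA
#define GL_FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER_EXT        0x8CDB
#define GL_FRAMEBUFFER_INCOMPLETE_READ_BUFFER_EXT        0x8CDC
#define GL_FRAMEBUFFER_UNSUPPORTED_EXT                   0x8CDD
#endif
#ifndef GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE_EXT
#define GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE_EXT        0x8D56
#endif

/* Translate a framebuffer status code into a readable name. */
static const char *fbo_status_name(unsigned status)
{
    switch (status) {
    case GL_FRAMEBUFFER_COMPLETE_EXT:                      return "complete";
    case GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT_EXT:         return "incomplete attachment";
    case GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT: return "missing attachment";
    case GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT:         return "incomplete dimensions";
    case GL_FRAMEBUFFER_INCOMPLETE_FORMATS_EXT:            return "incomplete formats";
    case GL_FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER_EXT:        return "incomplete draw buffer";
    case GL_FRAMEBUFFER_INCOMPLETE_READ_BUFFER_EXT:        return "incomplete read buffer";
    case GL_FRAMEBUFFER_UNSUPPORTED_EXT:                   return "unsupported";
    case GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE_EXT:        return "incomplete multisample";
    default:                                               return "unknown status";
    }
}

/* Return 1 if the status means "complete"; otherwise print it and return 0. */
static int fbo_report(const char *label, unsigned status)
{
    assert(label != NULL);
    if (status == GL_FRAMEBUFFER_COMPLETE_EXT)
        return 1;
    fprintf(stderr, "%s: %s (0x%04X)\n", label, fbo_status_name(status), status);
    return 0;
}
```

In my code this would be called as fbo_report( "fbo[0]", glCheckFramebufferStatusEXT( GL_DRAW_FRAMEBUFFER_EXT ) ) right after each glFramebufferRenderbufferEXT call, using the target the FBO is currently bound to, together with a glGetError check to catch GL_OUT_OF_MEMORY from the storage calls.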
PS. Please also comment on ways to improve the speed of the operations above; it’s not as fast as I hoped for.
PS2. The maximum AA given by glGetIntegerv( GL_MAX_SAMPLES, &maxaa ) is 8 for my 4850.
But I noticed in CCC that the Edge-detect AA can go up to 24. How can I reach 24?
PS3. I decided to try a lower maxaa instead of the maximum of 8 allowed on my 4850. With maxaa = 4 I was able to use higher values for tsize and dsamp, including tsize = 512 and dsamp = 8, for a total AA of 256x. Or maxaa = 2, tsize = 512 and dsamp = 16 for a total AA of 512x (well, this is overkill).
When tsize = 512 and dsamp = 16, the large FBOs are 8192x8192, but any MSAA level above 2 fails. Maybe my 1GB of video RAM isn’t enough?
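To put rough numbers on the video RAM question, here is a minimal sketch of the storage estimate I am working from. The helper names rb_bytes and total_bytes are hypothetical, and the bytes-per-sample value is an assumption: GL_ALPHA8 is nominally 1 byte per sample, but I do not know whether the driver pads it to a wider internal format, or how much extra memory it reserves for multisampling.

```c
#include <assert.h>
#include <stdint.h>

/* Lower-bound storage for one square renderbuffer, in bytes.
   samples == 0 is treated as a non-multisampled buffer (one sample). */
static uint64_t rb_bytes(uint64_t side, uint64_t samples, uint64_t bytes_per_sample)
{
    assert(side > 0 && bytes_per_sample > 0);
    if (samples == 0)
        samples = 1;
    return side * side * samples * bytes_per_sample;
}

/* Total for the three renderbuffers used above: large multisampled,
   large single-sampled, and small single-sampled. */
static uint64_t total_bytes(uint64_t tsize, uint64_t dsamp, uint64_t maxaa,
                            uint64_t bytes_per_sample)
{
    uint64_t large = tsize * dsamp;
    return rb_bytes(large, maxaa, bytes_per_sample)
         + rb_bytes(large, 1, bytes_per_sample)
         + rb_bytes(tsize, 1, bytes_per_sample);
}
```

With tsize = 512, dsamp = 16 and 1 byte per sample, this gives about 576 MiB at maxaa = 8 and about 192 MiB at maxaa = 2. If the driver actually stored 4 bytes per sample, the 8192x8192 multisampled buffer alone would need 1 GiB at maxaa = 4.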