Can't disable multisampling w/ multisample texture

Hello,

This thread is actually a follow-up to an issue initially described in this discussion.

I’m trying to render to multisample textures (ARB_texture_multisample) with multisample disabled through glDisable(GL_MULTISAMPLE). For any given pixel/texel, I expected to get the same color for every sub-sample. However, when fetching individual samples (through texelFetch in GLSL), they don’t always all have the same value.

I developed a small app in order to diagnose the problem.
Code here: http://emmanuel.raulo.free.fr/oglorg/multisamplepb.cpp.
Win32 binaries here (needs VC9 SP1 redist): http://emmanuel.raulo.free.fr/oglorg/multisamplepb-win32.7z

Here is the code in case hosting is too crappy for people to successfully download it (sorry for the long post):


#include <cstdlib>
#include <string>
#include <iostream>
#include <sstream>
#include <GL/glew.h>
#include <GL/glut.h>
#ifndef _WIN32
# include <GL/gl.h>
# include <GL/glu.h>
#endif


//----------------------------------------------------------------------------
// Some global variables...
static GLuint programHandle = 0;
static GLint samplerLoc = -1;  // glGetUniformLocation returns GLint (-1 if not found)
static GLint layerLoc = -1;
static GLuint fboHandle = 0;
static GLuint textureHandle = 0;
static GLint layer = 0;
static GLint maxLayers = 0;
static bool multisample=false;


//----------------------------------------------------------------------------
// Shader for displaying multisample buffer content...
static const char* shaderSrc =
  "#extension GL_ARB_texture_multisample : enable                         \n"
  "uniform sampler2DMS sampler;                                           \n"
  "uniform int layer;                                                     \n"
  "                                                                       \n"
  "void main() {                                                          \n"
  "  gl_FragColor = texelFetch( sampler, ivec2(gl_FragCoord.xy), layer ); \n"
  "}                                                                      \n";


//----------------------------------------------------------------------------
// Display callback...
void myDisplay( void )
{
  glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, fboHandle );
  glDrawBuffer( GL_COLOR_ATTACHMENT0_EXT );
  glViewport( 0, 0, 512, 512 );
  glClear( GL_COLOR_BUFFER_BIT );
  if( multisample )
    glEnable( GL_MULTISAMPLE );
  else
    glDisable( GL_MULTISAMPLE );
  glLoadIdentity();
  glRotatef( 22.5f, 0.0f, 0.0f, 1.0f );
  glRectf( -0.5f, -0.5f, 0.5f, 0.5f );
  glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, 0 );

  glEnable( GL_TEXTURE_2D );
  glBindTexture( GL_TEXTURE_2D_MULTISAMPLE, textureHandle );
  glUseProgram( programHandle );
  glUniform1i( samplerLoc, 0 );
  glUniform1i( layerLoc, layer );
  glLoadIdentity();
  glViewport( 0, 0, 512, 512 );
  glRecti( -1, -1, 1, 1 );
  glUseProgram( 0 );
  glBindTexture( GL_TEXTURE_2D_MULTISAMPLE, 0 );
  glDisable( GL_TEXTURE_2D );

  glutSwapBuffers();
}


//----------------------------------------------------------------------------
// Keyboard callback...
void myKeyboard( unsigned char key, int x, int y )
{
  switch( key ) {
  case ' ':
    layer = (layer+1)%maxLayers;
    std::cout << "Now displaying sample " << layer << std::endl;
    glutPostRedisplay();
    break;

  case 'm':
    multisample = !multisample;
    std::cout
      << "Multisample " << ( multisample? "enabled" : "disabled" )
      << std::endl;
    glutPostRedisplay();
    break;

  case 27:
    exit(0);
  }
}


//----------------------------------------------------------------------------
int main( int argc, char **argv )
{
  glutInit( &argc, argv );
  glutInitDisplayMode( GLUT_RGB | GLUT_DOUBLE );
  glutInitWindowSize( 512, 512 );
  glutCreateWindow( "multisample test" );

  try {
    GLenum status = glewInit();
    if( status!=GLEW_NO_ERROR )
      throw std::string("glewInit error: ")+(const char*)glewGetErrorString(status);

    if( !GLEW_VERSION_2_0 )
      throw std::string("OpenGL 2.0 not supported");

    if( !glewIsExtensionSupported("GL_EXT_framebuffer_object") )
      throw std::string("EXT_framebuffer_object not supported");

    if( !glewIsExtensionSupported("GL_ARB_texture_multisample") )
      throw std::string("ARB_texture_multisample not supported");

    GLuint shaderHandle = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource( shaderHandle, 1, &shaderSrc, 0 );
    glCompileShader( shaderHandle );

    programHandle = glCreateProgram();
    glAttachShader( programHandle, shaderHandle );
    glLinkProgram( programHandle );
    glUseProgram( programHandle ); // ATi workaround
    samplerLoc = glGetUniformLocation( programHandle, "sampler" );
    layerLoc = glGetUniformLocation( programHandle, "layer" );
    glUseProgram( 0 );

    glGenTextures( 1, &textureHandle );
    glBindTexture( GL_TEXTURE_2D_MULTISAMPLE, textureHandle );
    glGetIntegerv( GL_MAX_SAMPLES, &maxLayers );
    glTexImage2DMultisample( GL_TEXTURE_2D_MULTISAMPLE, maxLayers, GL_RGB8, 512, 512, GL_TRUE );
    glBindTexture( GL_TEXTURE_2D_MULTISAMPLE, 0 );
    status = glGetError();
    if( status!=GL_NO_ERROR )
      throw std::string("Failed to allocate texture: ")+(const char*)gluErrorString(status);

    glGenFramebuffersEXT( 1, &fboHandle );
    glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, fboHandle );
    glFramebufferTexture2D( GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                            GL_TEXTURE_2D_MULTISAMPLE, textureHandle, 0 );
    status = glCheckFramebufferStatusEXT( GL_FRAMEBUFFER_EXT );
    if( status!=GL_FRAMEBUFFER_COMPLETE_EXT ) {
      std::ostringstream str;
      str << "Framebuffer not complete: " << status;
      throw str.str();
    }
    glBindFramebufferEXT( GL_FRAMEBUFFER_EXT, 0 );
  }
  catch( std::string errString ) {
    std::cerr << errString << std::endl;
    return -1;
  }

  glutDisplayFunc( &myDisplay );
  glutKeyboardFunc( &myKeyboard );

  glutMainLoop();

  return 0;
}

Hitting space will cycle through sample indices. Hitting ‘m’ will toggle multisample (initially disabled).

With an nVidia 8800GTX and latest drivers, disabling multisampling will only provide 4 samples with the same value, then 4 other samples with another value, and so on…

I’m looking for a way to broadcast every processed fragment to all of its sub-samples. Is there another way to go?
Can people try this app and tell me if they have more luck (e.g. exact same image for every sample) with some other setup, please?

Help would be greatly appreciated.

I get the same behavior on a Quadro FX 5800 with driver 197.44. With multisampling disabled, the sample position changes after every 4th sample… so, samples 0-3 are in the same position, 4-7 are in another, 8-11 in another, and 12-15 in another. I also tried using glBlitFramebuffer to resolve the multisample buffer, and with multisampling disabled, there is still some antialiasing happening (less than with multisampling enabled, it looks like 4 samples vs 16 samples in my case).
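
To make the grouping concrete, here is a minimal sketch of the mapping described above. It assumes the driver builds a 16-sample buffer from blocks of 4 samples that share one rasterization position when multisampling is disabled; that is inferred from the observed output, not from any driver documentation. The function name and the nativeSamples parameter are made up for illustration.

```cpp
// Hypothetical helper (not part of the repro app): map a sample index to the
// block of samples that appears to share one position when GL_MULTISAMPLE is
// disabled. 'nativeSamples' is the observed block size (4 in the reports
// above); it is an assumption, not a value queried from OpenGL.
int positionGroup( int sampleIndex, int nativeSamples = 4 )
{
  return sampleIndex / nativeSamples;
}

// Number of distinct positions one would then expect in a buffer of
// 'totalSamples' samples (rounded up if the last block is partial).
int expectedDistinctPositions( int totalSamples, int nativeSamples = 4 )
{
  return ( totalSamples + nativeSamples - 1 ) / nativeSamples;
}
```

Under that assumption, samples 0-3 fall in group 0 and samples 12-15 in group 3, which matches what the repro shows when cycling with the space bar.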

Here’s the code I used to resolve the buffer. Also, I had to limit the MSAA texture sample count to 16; my card reports 32 as the max sample count, but you can’t create a normal MSAA texture with that many samples.

glBindFramebufferEXT(GL_READ_FRAMEBUFFER_EXT, fboHandle);
glBindFramebufferEXT(GL_DRAW_FRAMEBUFFER_EXT, 0);
glBlitFramebufferEXT(0, 0, 512, 512, 0, 0, 512, 512,
GL_COLOR_BUFFER_BIT, GL_LINEAR);
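
For the sample-count limit mentioned above, one possible approach is to clamp the requested count before calling glTexImage2DMultisample. ARB_texture_multisample defines GL_MAX_COLOR_TEXTURE_SAMPLES for color-format multisample textures, and it may be lower than GL_MAX_SAMPLES; whether it reports 16 on this card has not been checked here. The helper below is a hypothetical sketch that keeps the GL query outside the function, so it can be exercised without a context; its name and the hardCap parameter are illustrative only.

```cpp
#include <algorithm>

// Hypothetical helper (not part of the original repro): choose a sample count
// for a multisample color texture.
//   requested              - what the application would like to use
//   maxColorTextureSamples - value obtained by the caller with
//       glGetIntegerv( GL_MAX_COLOR_TEXTURE_SAMPLES, &maxColorTextureSamples );
//   hardCap                - extra application-side limit (16 in the post above)
int chooseSampleCount( int requested, int maxColorTextureSamples, int hardCap = 16 )
{
  int limit = std::min( maxColorTextureSamples, hardCap );
  return std::max( 1, std::min( requested, limit ) );
}
```

In the repro, the result would replace maxLayers both in the glTexImage2DMultisample call and in the modulo used when cycling through samples.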

I’m not sure if the spec allows this behavior or not, though it seems incorrect. I never noticed it because I only use 4 samples in my code. You should send your repro case to NVIDIA and see what they say. Either post it in the driver sub forum here or send a message to one of the NVIDIA employees who post here (barthold).

I tried your repro on a Radeon 5850 and it works as expected.

It sounds like the driver is implementing 16 sample MSAA as 4x MSAA + 4x SSAA.

That’s exactly what GeForces do. The HW supports up to 4x MSAA natively. This changed with GTX4xx.

Just as a data point, here is what a GeForce GTX285 reports on Linux for available AA modes:


> nvidia-settings --query=fsaa --verbose
...
    Valid 'FSAA' Values
      value - description
        0   -   Off
        1   -   2x (2xMS)
        5   -   4x (4xMS)
        7   -   8x (4xMS, 4xCS)
        8   -   16x (4xMS, 12xCS)
        9   -   8x (4xSS, 2xMS)
       10   -   8x (8xMS)
       12   -   16x (8xMS, 8xCS)

(195.36.07.04 drivers).

As you can see, no 4xMS+4xSS here. Everything is MS and/or CS (coverage sampled AA) except 4xSS+2xMS. They used to have more SS options built in, and on a GeForce 7 even (2xMS+4xSS, 4xMS+4xSS, 4xMS+2xSS) and GeForce 8 (4xMS+4xSS, 8xMS+4xSS), but they got rid of them back in 4Q08.

Strange. I’ve run into the problem AlexN describes, and I read on AnandTech that only Radeons truly support 8x MSAA, so I was hasty in my conclusion.

Well, an NVidia person would have to say for sure, but…

Since CSAA ~= MSAA, mode 7 is pretty much 8x MSAA in quality. The difference is that 4xMS+4xCS should allegedly be faster than 8xMS (performing like 4xMS), and there’ll be a slight visual degradation at any pixel with a boatload of intersecting edges where the CSAA color table overflows.

Hello and thanks a lot for all the replies!

I tried 8 samples this morning and it worked as expected.

I will stick to 8 layers for my K-buffer until a fix/workaround is provided or I have the time to implement depth-peeling instead of stencil routed K-buffer.

By the way, another “funny” limitation I stumbled upon with the nVidia cards is that I can’t attach RGB32F multisample textures with more than 4 samples to a framebuffer object.
I can allocate texture memory OK (checking with glGetError) but glCheckFramebufferStatus() ends up returning GL_FRAMEBUFFER_UNSUPPORTED. I don’t really see the point in being able to allocate such a texture and not being able to use it as a render target (MS texture data can’t be uploaded by any other means).
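
Since allocation succeeds and only the completeness check fails, one way to cope is to probe downward until the framebuffer is accepted. The sketch below is hypothetical: findUsableSampleCount and its isComplete callback are names invented here, and the callback is assumed to reallocate the texture with the given sample count, attach it, and return whether glCheckFramebufferStatus reported GL_FRAMEBUFFER_COMPLETE.

```cpp
#include <functional>

// Hypothetical helper: starting from 'startSamples', halve the sample count
// until 'isComplete' accepts it. Returns the first accepted count, or 0 if
// no count down to 1 was accepted.
// 'isComplete' is supplied by the caller; in a real program it would
// re-create the multisample texture, attach it to the FBO and test the
// framebuffer status. Here it is just a predicate.
int findUsableSampleCount( int startSamples,
                           const std::function<bool( int )>& isComplete )
{
  for( int samples = startSamples; samples >= 1; samples /= 2 )
    if( isComplete( samples ) )
      return samples;
  return 0;
}
```

With the behavior described above (RGB32F rejected beyond 4 samples), a probe starting at 16 would try 16, then 8, and settle on 4.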

Ignore all that. Mode 10 says it’s genuine 8xMSAA. :P

G80 and higher support a true 8x MSAA mode. G7x does not; it only supports a true 4x MSAA mode. On Windows, the 8x MSAA mode is called “8xQ” in the control panel on GeForce. On Quadro it is called “8x (8xMS)”.

Quadro has more AA modes than GeForce does. With SLI (either Quadro or GeForce) you get even more modes, and higher modes also.

Barthold
(with my NVIDIA hat on)