GTX260 hiccuping bug? (with standalone reproducer)

I’m experiencing a weird hiccuping issue when using floating-point FBOs and issuing many primitives that overlap one another in sequence. It only happens on a particular video card, with a particular sequence of interactions and with primitives sent in a particular order, and only intermittently. Tracking this down was a bit of a pain.

The particular triggering example is a set of GL_POINTS issued via client-side glVertexPointer and glDrawArrays (I can’t use display lists or VBOs because the real application data will not fit in video memory). The hiccup is fairly severe: between 0.2 and 0.4 seconds when rendering 10^6 points, which is typically instant. I have a 250-line reproducing example that triggers it consistently; I’m posting it at the end of this message. However, it takes a few odd things in combination to trigger the hiccup, so here’s some discussion first. Features of this strange issue:

[ul]
[li]It does not manifest itself at every frame. I experience a hiccup as I’m zooming in or out of a set of primitives (sometimes I can get the hiccup to appear while the number of primitives inside the view frustum changes via panning, but zooming is a consistent way to get it). In particular, it seems to appear when the point overdraw crosses some critical threshold. Then there will be a hiccup or two, and then the app stabilizes and goes back to fast frame rates.[/li]
[li]The ordering of the points in the vertex array matters. If the points are randomly ordered (such that, I suspect, two consecutive primitives don’t hit the same spot in the FBO), then no hiccuping happens. However, if I sort the input array so that consecutive primitives are likely to hit the same pixels in the FBO, then hiccuping is likely to happen.[/li]
[li]By profiling the code, I can see that the piece of code in which OpenGL hiccups changes depending on whether I’m zooming in or out (I told you this was a weird issue). When zooming out (while the amount of overdraw increases), the hiccup happens during the actual rendering calls (glVertexPointer, glDrawArrays). When zooming in (while the amount of overdraw decreases), the hiccup happens in glutSwapBuffers.[/li]
[li]If I render directly to the framebuffer, to a fixed-point FBO, or even to an FP16 FBO, no hiccuping happens. FP32, however, triggers the issue.[/li]
[li]I have run the code on two different Linux setups with two different video cards. My work machine, which triggers the issue, has a GTX 260 and a pair of quad-core Xeons with 12GB of RAM (x86_64, Ubuntu 9.10). My home machine, which does not trigger the issue, has a GTS 250 and a quad-core i7 920 with 6GB of RAM (x86_64, Ubuntu 10.04).[/li]
[/ul]
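The effect of the point ordering can be looked at without any OpenGL. The sketch below mirrors the scheme used by reorder_data in the reproducer (sort a range by x, split it in half, sort each half by y, and so on) and measures how far apart consecutive array entries end up. The names `reorder`, `mean_step`, `random_points` and the `leaf` parameter are illustrative and not part of the reproducer, and whether this locality is what actually upsets the driver is only a suspicion.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Pt { float x, y; };

// Sort [lo, hi) along one axis, then recurse on each half with the other
// axis. `leaf` (hypothetical parameter) stands in for the hard-coded 1024
// cutoff in the reproducer; it must be at least 2 so the recursion ends.
void reorder(std::vector<Pt>& v, std::size_t lo, std::size_t hi,
             int axis, std::size_t leaf)
{
    assert(leaf >= 2 && lo <= hi && hi <= v.size());
    if (hi - lo < leaf) return;
    std::sort(v.begin() + lo, v.begin() + hi,
              [axis](const Pt& a, const Pt& b) {
                  return axis == 0 ? a.x < b.x : a.y < b.y;
              });
    std::size_t mid = lo + (hi - lo) / 2;
    reorder(v, lo, mid, 1 - axis, leaf);
    reorder(v, mid, hi, 1 - axis, leaf);
}

// Mean Euclidean distance between consecutive array entries.
double mean_step(const std::vector<Pt>& v)
{
    assert(v.size() >= 2);
    double sum = 0.0;
    for (std::size_t i = 1; i < v.size(); ++i)
        sum += std::hypot(double(v[i].x - v[i - 1].x),
                          double(v[i].y - v[i - 1].y));
    return sum / double(v.size() - 1);
}

// Uniform random points in the unit square, fixed seed for repeatability.
std::vector<Pt> random_points(std::size_t n)
{
    std::mt19937 rng(12345);
    std::uniform_real_distribution<float> u(0.0f, 1.0f);
    std::vector<Pt> v(n);
    for (Pt& p : v) { p.x = u(rng); p.y = u(rng); }
    return v;
}
```

Comparing mean_step(pts) before and after reorder(pts, 0, pts.size(), 0, 16) on the output of random_points should give a much smaller step for the reordered array, which is consistent with consecutive primitives landing on nearby pixels.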
Here’s an output of the program as I’m zooming in and out:


HICCUP! Draw time: 0.19 Swap time: 0
HICCUP! Draw time: 0 Swap time: 0.32
HICCUP! Draw time: 0.19 Swap time: 0
HICCUP! Draw time: 0.19 Swap time: 0
HICCUP! Draw time: 0.01 Swap time: 0.33

Notice the alternation between draw time and swap time.
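One caveat about these numbers: the reproducer takes them with clock(), which reports processor time used by the program rather than elapsed time, so a stall in which the driver sleeps would not be counted. A minimal sketch of measuring both is below, assuming C++11; `Timing`, `time_both` and `sleeping_stall` are hypothetical names, and the sleep is only a stand-in for a blocking call such as render_points() or glutSwapBuffers().

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <thread>

struct Timing { double cpu_seconds; double wall_seconds; };

// Run `work` once and report CPU time (clock(), as in the reproducer)
// alongside wall-clock time (steady_clock).
template <typename F>
Timing time_both(F&& work)
{
    std::clock_t c0 = std::clock();
    auto w0 = std::chrono::steady_clock::now();
    work();
    auto w1 = std::chrono::steady_clock::now();
    std::clock_t c1 = std::clock();

    Timing t;
    t.cpu_seconds = double(c1 - c0) / CLOCKS_PER_SEC;
    t.wall_seconds = std::chrono::duration<double>(w1 - w0).count();
    assert(t.wall_seconds >= 0.0);  // steady_clock never goes backwards
    return t;
}

// Hypothetical stand-in for a call that blocks without using the CPU.
inline void sleeping_stall()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
```

For time_both(sleeping_stall) the wall-clock figure covers the sleep while the CPU figure stays near zero. Since the hiccups in the output above do show up in clock(), the stall appears to be spent burning CPU rather than sleeping, though that is an inference. OpenGL calls may also be queued, so calling glFinish() before reading the timer would make the draw/swap attribution less ambiguous.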

To reproduce the issue, compile and run the file below as follows (assuming the gcc toolchain):


$ g++ -O2 main.cc -o main -lGLEW -lglut
$ ./main 3

with “3” as the command-line option. Values 0, 1, 2, 3 exercise different code paths: bit 0 selects whether to use an FP FBO, and bit 1 selects whether to sort the input points so as to trigger the hiccup. To zoom in and out, hold the right mouse button and drag it back and forth. If I zoom out until the blob of red points is about a third of its starting size, I can get a hiccup pretty consistently. Zooming back in, I’ll get another hiccup on the way back, and this happens pretty much every time.
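The mapping from option value to code path can be written out as a small helper. This is a sketch only: `Mode`, `parse_mode` and `check_modes` are hypothetical names, and the range validation is an addition that the reproducer does not perform.

```cpp
#include <cassert>
#include <cstdlib>

struct Mode { bool use_fbo; bool sort_points; };

// Decode the option the same way main() does: bit 0 selects the FP32 FBO
// path, bit 1 selects the sorted point order. Returns false when the
// argument is missing, not a number, or outside 0..3.
bool parse_mode(const char* arg, Mode& out)
{
    if (arg == nullptr) return false;
    char* end = nullptr;
    long option = std::strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || option < 0 || option > 3) return false;
    out.use_fbo = (option & 1) != 0;
    out.sort_points = (option & 2) != 0;
    return true;
}

// Usage check covering the two extremes: mode 3 enables both paths
// (the combination that hiccups), mode 0 enables neither.
void check_modes()
{
    Mode m{};
    assert(parse_mode("3", m) && m.use_fbo && m.sort_points);
    assert(parse_mode("0", m) && !m.use_fbo && !m.sort_points);
}
```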

Here’s the code:


#include <GL/glew.h>
#include <GL/glut.h>
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <vector>

using namespace std;

#define gl_check()                                              \
do {                                                            \
    GLenum err = glGetError();                                  \
    if (GL_NO_ERROR != err) {                                   \
        cerr << __FILE__ << ":" << __LINE__ << " - GL ERROR: "  \
             << gluErrorString(err) << endl;                    \
        exit(1);                                                \
    }                                                           \
} while (0)

/******************************************************************************/

struct XYPoint
{
    float m_x, m_y;
};

int window_width = 800, window_height = 800;
float center_y = 0.5f, center_x  = 0.5f, scale = 1.0;
GLuint fbo, renderbuffer_depth, texture_color;
vector<XYPoint> points;
bool use_fbo = true;

/******************************************************************************/
//  points

void render_points()
{
    glBindTexture(GL_TEXTURE_2D, 0);

    if (use_fbo) {
        glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fbo);
    }

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glEnableClientState(GL_VERTEX_ARRAY);
    glColor3f(1,0,0);
    glVertexPointer(2, GL_FLOAT, 0, &points[0]);
    glDrawArrays(GL_POINTS, 0, 1000000);
    glDisableClientState(GL_VERTEX_ARRAY);

    // Render texture
    if (use_fbo) {
        glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
        glEnable(GL_TEXTURE_2D);
        glBindTexture(GL_TEXTURE_2D, texture_color);
        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        glMatrixMode(GL_PROJECTION);
        glLoadIdentity();
        glBegin(GL_QUADS);
        glTexCoord2f( 0, 0); glVertex2f(-1,-1);
        glTexCoord2f( 1, 0); glVertex2f( 1,-1);
        glTexCoord2f( 1, 1); glVertex2f( 1, 1);
        glTexCoord2f( 0, 1); glVertex2f(-1, 1);
        glEnd();
        glMatrixMode(GL_MODELVIEW);
        glDisable(GL_TEXTURE_2D);
    }
}

/******************************************************************************/

float pixel_to_world_units(float pixels)
{
    return pixels * (2.0 / (window_width * scale));
}

int old_x, old_y;
void pan(int to_x, int to_y)
{
    center_x -= pixel_to_world_units(to_x - old_x);
    center_y += pixel_to_world_units(to_y - old_y);
    old_x = to_x;
    old_y = to_y;
    glutPostRedisplay();
}

void zoom(int to_x, int to_y)
{
    float delta_y_world = pixel_to_world_units(to_y - old_y) * scale * 3;
    old_y = to_y;
    scale *= 1.0 + delta_y_world;
    glutPostRedisplay();
}

void mouse(int button, int state, int x, int y)
{
    old_x = x;
    old_y = y;
    if (state == GLUT_DOWN && button == GLUT_MIDDLE_BUTTON) {
        glutMotionFunc(pan);
    }
    if (state == GLUT_DOWN && button == GLUT_RIGHT_BUTTON) {
        glutMotionFunc(zoom);
    }
    if (state == GLUT_UP) {
        glutMotionFunc(NULL);
    }
}

void display()
{
    float draw_time, swap_time;
    glClearColor(0,0,0,0);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(-1, 1, -1, 1, -1, 1);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glScalef(scale, scale, 1.0f);
    glTranslatef(-center_x, -center_y, 0.0f);

    clock_t t1 = clock(), t2;

    render_points();

    t2 = clock();
    draw_time = ((t2 - t1) / (float)CLOCKS_PER_SEC); 
    t1 = t2;

    glutSwapBuffers();

    t2 = clock(); 
    swap_time = ((t2 - t1) / (float)CLOCKS_PER_SEC); 
    t1 = t2;

    if (draw_time > 0.1 || swap_time > 0.1) {
        cerr << "HICCUP! Draw time: " << draw_time
             << " Swap time: " << swap_time << endl;
    }

    gl_check();
}

void keyboard(unsigned char key, int x, int y)
{
    if (key == 27) {
        exit(0);
    }
}

/******************************************************************************/

bool sort_x(const XYPoint &v1, const XYPoint &v2)
{
    return v1.m_x < v2.m_x;
}

bool sort_y(const XYPoint &v1, const XYPoint &v2)
{
    return v1.m_y < v2.m_y;
}

void reorder_data(std::vector<XYPoint> &v, int mn, int mx, int parity)
{
    if (mx - mn < 1024) {
        return;
    }
    if (parity == 0) {
        sort(v.begin() + mn, v.begin() + mx, sort_x);
    } else {
        sort(v.begin() + mn, v.begin() + mx, sort_y);
    }
    reorder_data(v, mn, (mn + mx)/2, 1-parity);
    reorder_data(v, (mn + mx)/2, mx, 1-parity);
}

/******************************************************************************/

int main(int argc, char **argv)
{
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_DEPTH | GLUT_RGBA | GLUT_ALPHA | GLUT_DOUBLE);
    glutInitWindowSize(window_width, window_height);
    int window = glutCreateWindow("Demo");

    glutDisplayFunc(display);
    glutKeyboardFunc(keyboard);
    glutMouseFunc(mouse);

    GLenum err = glewInit();
    if (GLEW_OK != err) {
        cerr << "Error initializing GLEW: " << glewGetErrorString(err) << endl;
        exit(1);
    }
    cerr << "GLEW OK, version " << glewGetString(GLEW_VERSION) << endl;

    for (size_t i=0; i<1000000; ++i) {
        XYPoint point;
        point.m_x = double(rand()) / double(RAND_MAX);
        point.m_y = double(rand()) / double(RAND_MAX);
        points.push_back(point);
    }
    bool sort_points = false;
    if (argc < 2) {
        cerr << "Usage: " << argv[0] << " <mode 0-3>\n";
        exit(1);
    }
    int option = atoi(argv[1]);
    use_fbo = (option & 1) == 1;
    sort_points = (option & 2) == 2;
    switch (option) {
    case 0:
        cerr << "Mode 0: no FP FBO, no sorted points - should NOT see hiccups\n";
        break;
    case 1:
        cerr << "Mode 1: FP FBO, no sorted points - should NOT see hiccups\n";
        break;
    case 2:
        cerr << "Mode 2: no FP FBO, sorted points - should NOT see hiccups\n";
        break;
    case 3:
        cerr << "Mode 3: FP FBO, sorted points - SHOULD SEE HICCUPS\n";
        break;
    }
    if (sort_points) {
        reorder_data(points, 0, 1000000, 0);
    }
    
    glGenFramebuffers(1, &fbo);
    glGenRenderbuffers(1, &renderbuffer_depth);
    glGenTextures(1, &texture_color);
    glBindTexture(GL_TEXTURE_2D, texture_color);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, 
                 window_width, window_height, 0, GL_RGBA, GL_FLOAT, NULL);
    glBindRenderbuffer(GL_RENDERBUFFER, renderbuffer_depth);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, 
                          window_width, window_height);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, 
                           GL_TEXTURE_2D, texture_color, 0);
    glFramebufferRenderbuffer(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, 
                              GL_RENDERBUFFER, renderbuffer_depth);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);

    gl_check();
    glutMainLoop();
    return 0;
}

I have added calls to glGetError to make sure nothing bad is happening to the state, but the hiccup appears regardless of whether or not I check the errors.
Can anyone else reproduce this? Is there a known workaround? I pretty much cannot move away from FP32 FBOs or from the point ordering that triggers this issue.

Thank you very much, and sorry for the very long post.

I took the time to compile and run your test program but I do not get any “hiccup” lines to std out.
I ran the app with argument “3” as suggested.
Here is the only output I got on std out:

devel@devel-desktop:/mnt/space1/projects/gtx260hiccup$ ./main 3
GL_RENDERER: GeForce GTX 260/PCI/SSE2
GL_VERSION: 3.2.0 NVIDIA 195.36.15
GLEW OK, version 1.5.3
Mode 3: FP FBO, sorted points - SHOULD SEE HICCUPS
devel@devel-desktop:/mnt/space1/projects/gtx260hiccup$

(I added two lines to your program to output GL_RENDERER and GL_VERSION strings)

Please note that I tried all zoom positions (from the red rectangle on the whole window until it got to a single pixel in the middle of the window and back) and ran the app several times.

The machine is an AMD Phenom II quad core with a GTX260 and 4GB RAM.

nsdev, thanks for trying this out.

This is bizarre - I just tried it again here and I still see the hiccups. What OS are you running?

Ubuntu 9.10 (kernel 2.6.31-20, gcc 4.4.1)
Maybe someone with the same GPU can give the test program a try?