cross-process texture management

I am working on a 3D volume rendering application that downloads volume data to the graphics card as 3D textures and does ray casting. To maximize performance I would like to keep my 3D textures resident as much as possible, so I create all the textures with glTexImage3D(), sample them in my shader, and let OpenGL do all the texture management. This works fine. As a result, if I am rendering a 600MB dataset on a card with 512MB of VRAM, my program will take all the video memory available.
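
For reference, the upload path is essentially the following (a rough sketch; the internal format, dimensions, and names here are placeholders, not my exact code):

#include <GL/glew.h>

// Upload one volume as a 3D texture. Assumes a GL context is current and
// glewInit() has run; one 8-bit channel per voxel is an assumption.
GLuint uploadVolume(const GLubyte* voxels, int w, int h, int d)
{
    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_3D, tex);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    // the driver decides where (VRAM or system memory) the texture lives
    glTexImage3D(GL_TEXTURE_3D, 0, GL_LUMINANCE8, w, h, d, 0,
                 GL_LUMINANCE, GL_UNSIGNED_BYTE, voxels);
    return tex;
}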

The problem arises when I want to start a second instance of my program to render another dataset at the same time. It is unable to allocate any textures because there is no VRAM available. This surprised me a little, because I would expect the display driver to do something smart like swapping out textures from the first program to make room for the new textures from the second. But this doesn't seem to be the case. So my question is: is this really how it's supposed to work? If one greedy program grabs all the video memory, will every other program fail to allocate textures? If so, is there any workaround I can use? And is this controlled by OpenGL, the display driver, or the operating system?

I wrote a standalone test program to show the issue. Running it allocates 100 2D textures (8MB each). The second instance (or the third, depending on the amount of VRAM on your card) will fail to run, even after the first instance deletes all its textures. However, if the first instance deletes its GL context, the second instance is able to allocate textures again.


#include "windows.h"
#include <iostream>
#include "gl/glew.h"
#include "gl/glut.h"

using namespace std;

int main(int argc, char* argv[])
{
    // create gl context
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGBA | GLUT_DEPTH);
    glutCreateWindow("Test");
    GLenum err = glewInit();
    if (GLEW_OK != err)
    {
        cout << "Error: " << glewGetErrorString(err) << endl;
    }

    int nTextures = 100;
    GLuint* textures = new GLuint[nTextures];
    glGenTextures(nTextures, textures);
    // create FBO
    GLuint fbo;
    glGenFramebuffersEXT(1, &fbo);
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);

    for (int i = 0; i < nTextures; ++i)
    {
        // create texture
        glBindTexture(GL_TEXTURE_2D, textures[i]);
        glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
        glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
        // 1024x1024 RGBA16F = 8MB per texture; NULL data, storage only
        glTexImage2D(   GL_TEXTURE_2D, 0, GL_RGBA16F_ARB, 
                        1024, 1024, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
        // attach texture to fbo and verify completeness before touching it
        glFramebufferTexture2DEXT(  GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, 
                                    GL_TEXTURE_2D, textures[i], 0);
        GLenum status = glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT);
        if (status != GL_FRAMEBUFFER_COMPLETE_EXT)
        {
            cout << "Error found (fbo): " << status << endl;
            cout << "press Enter to exit." << endl;
            getchar();
            return 0;
        }
        // clear the attachment so the driver actually commits its storage
        glClear(GL_COLOR_BUFFER_BIT);

        // GL_OUT_OF_MEMORY from a failed allocation would show up here
        GLenum error = glGetError();
        if (error != GL_NO_ERROR)
        {
            cout << "Error found (gl): " << error << endl;
            cout << "press Enter to exit." << endl;
            getchar();
            return 0;
        }
    }
    glFramebufferTexture2DEXT(  GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, 
                                GL_TEXTURE_2D, 0, 0);
    cout << "press Enter to delete textures." << endl;
    getchar();
    glDeleteTextures(nTextures, textures);
    glDeleteFramebuffersEXT(1, &fbo);
    delete [] textures;
    cout << "press Enter to delete gl context." << endl;
    getchar();
    HGLRC rc = wglGetCurrentContext();
    wglMakeCurrent(NULL, NULL);
    wglDeleteContext(rc);
    cout << "press Enter to exit." << endl;
    getchar();
    return 0;
}

What video card/driver/OS are you using?

nVidia 9800 GTX+ with the latest driver. Windows XP
Thanks

With glFramebufferTexture2DEXT you attach each and every texture as a render target, quite possibly marking it as non-swizzled.
(Speculation) It may be a GeForce limitation that render targets cannot live outside VRAM. It's also noted in the NV performance guides that FBOs should be the first thing you allocate.

Anyway, here's a test of your code, slightly changed (150 textures instead of 100; it doesn't delete the textures and FBO, and it doesn't exit):
http://dl.dropbox.com/u/1969613/openglForum/megaFBO.png
WinXPSP2, 2GB DDR3, GTX275 (896MB)
Runs in a pure 3.2 context (I modified the code to use the core functions instead of the EXT ones); drivers are 195.62.
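
The EXT-to-core change is mechanical; roughly this (a sketch, assuming GL 3.0+, where FBOs are core; `tex` stands for any existing 2D texture id):

#include <GL/glew.h>

// core-profile equivalents of the EXT entry points in the original code
GLuint makeFbo(GLuint tex)
{
    GLuint fbo = 0;
    glGenFramebuffers(1, &fbo);                    // was glGenFramebuffersEXT
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);        // was glBindFramebufferEXT
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, tex, 0); // was glFramebufferTexture2DEXT
    // GL_RGBA16F replaces GL_RGBA16F_ARB, and GL_CLAMP is gone from core
    if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE)
        return 0; // incomplete: caller should treat this as failure
    return fbo;
}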

There's a slight chance the GTX 275 handles textures/render targets in an improved way.

P.S. The simplistic scene should run at 1000+ fps even with the MSAA 4x FBO, so I bet this FBO's render targets were evicted to system RAM. Also, my vsync is 60Hz but is disabled in the app; the 50fps and 100fps values are a coincidence.

A temporary workaround, if no general solution comes up, would be to create only one render target, a 2D one. Instead of picking a given 3D-texture slice to render to, you render to this 2D texture and then copy the texels to the slice, as in the sketch below.
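
Roughly like this (a sketch; the function and variable names are made up, and it assumes the FBO holding the 2D color attachment is currently bound):

#include <GL/glew.h>

// Copy the current 2D render target into slice `z` of a 3D texture.
void copyRenderTargetToSlice(GLuint volumeTex, int z, int w, int h)
{
    glBindTexture(GL_TEXTURE_3D, volumeTex);
    // reads w*h pixels from the bound framebuffer into the given slice
    glCopyTexSubImage3D(GL_TEXTURE_3D, 0, 0, 0, z, 0, 0, w, h);
}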

I don't use 3D textures as render targets. My program samples 3D textures on the GPU and renders directly to 2D textures using an FBO. Each instance may create up to 1GB of 3D textures and several 2D textures (2MB each).
The issue is, if the first instance creates too many 3D textures, the second instance will be unable to allocate its 2D render targets, because it cannot evict the 3D textures created by the first instance.

According to a friend of mine, this is a limitation of WinXP. I verified that the problem goes away on Win7.

Vista should work fine, too. Under the new driver model (WDDM), video memory is virtualized and data can be swapped in and out as necessary. There is no such provision under XP's driver model.