OpenGL.org

Thread: Resource leak in 2xx nvidia drivers

  1. #1
    Intern Newbie
    Join Date
    Apr 2002
    Location
    Berlin
    Posts
    49

    Resource leak in 2xx nvidia drivers

    Hi,

    We are experiencing a strange out-of-memory condition with NVIDIA drivers on Vista 64. It happened very rarely with the 1xx driver series but has become drastically more frequent with the 2xx drivers (several times a day).

    The problem:
    After working with our OpenGL app for some time, all kinds of OpenGL calls suddenly start to fail. Most of the time binding an FBO fails with GL_FRAMEBUFFER_UNSUPPORTED, but other calls also fail with GL_OUT_OF_MEMORY. I managed to get GLExpert running, and it turned out that the FBO error is also caused by an out-of-memory condition. In some cases the driver seems to delay the allocation of renderbuffers and textures until they are first bound to an FBO, and the GLExpert messages indicate that there is not enough memory at that point, which leads to GL_FRAMEBUFFER_UNSUPPORTED. The driver also falls back to software emulation from time to time. These problems occur sooner and sooner until even starting the application no longer works, and eventually it crashes in nvogl.dll. Killing dwm.exe helps a little: it seems to free up at least some memory, but at some point only rebooting helps. We have reports from Windows XP users that even worse things happen there, resulting in bluescreens and corruption of other windows, but we are not 100% sure this is related.
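
    To tell the two failure modes apart in a log, it can help to print the symbolic name of whatever glCheckFramebufferStatus and glGetError return. The following is only a minimal sketch and not code from the application discussed here: the helper names framebufferStatusName and glErrorName are made up for illustration, and the enum values are copied by hand from the OpenGL headers so that the snippet compiles without a GL context.

```cpp
#include <string>

// Enum values as defined in the OpenGL headers, repeated here so the
// sketch builds without including any GL header (an assumption made for
// portability; real code would use the GL_* macros directly).
constexpr unsigned kFramebufferComplete          = 0x8CD5;
constexpr unsigned kFramebufferIncompleteAttach  = 0x8CD6;
constexpr unsigned kFramebufferMissingAttachment = 0x8CD7;
constexpr unsigned kFramebufferUnsupported       = 0x8CDD;

constexpr unsigned kNoError                      = 0x0000;
constexpr unsigned kInvalidEnum                  = 0x0500;
constexpr unsigned kInvalidValue                 = 0x0501;
constexpr unsigned kInvalidOperation             = 0x0502;
constexpr unsigned kOutOfMemory                  = 0x0505;
constexpr unsigned kInvalidFramebufferOperation  = 0x0506;

// Hypothetical helper: name for a glCheckFramebufferStatus result.
std::string framebufferStatusName(unsigned status)
{
    switch (status) {
    case kFramebufferComplete:          return "GL_FRAMEBUFFER_COMPLETE";
    case kFramebufferIncompleteAttach:  return "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT";
    case kFramebufferMissingAttachment: return "GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT";
    case kFramebufferUnsupported:       return "GL_FRAMEBUFFER_UNSUPPORTED";
    default:                            return "unknown framebuffer status";
    }
}

// Hypothetical helper: name for a glGetError result.
std::string glErrorName(unsigned error)
{
    switch (error) {
    case kNoError:                     return "GL_NO_ERROR";
    case kInvalidEnum:                 return "GL_INVALID_ENUM";
    case kInvalidValue:                return "GL_INVALID_VALUE";
    case kInvalidOperation:            return "GL_INVALID_OPERATION";
    case kOutOfMemory:                 return "GL_OUT_OF_MEMORY";
    case kInvalidFramebufferOperation: return "GL_INVALID_FRAMEBUFFER_OPERATION";
    default:                           return "unknown GL error";
    }
}
```

    In a real renderer the results of glCheckFramebufferStatus(GL_FRAMEBUFFER) and glGetError() would be passed to these helpers right after each FBO bind, so the log shows whether GL_FRAMEBUFFER_UNSUPPORTED and GL_OUT_OF_MEMORY appear together.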


    "Repro case":
    It seems to happen more often in certain scenarios where the rendering loop is paused for a long time. For instance, our editor app displays various preview windows, and when cooking a project it launches a player app to check whether the cooked project actually runs. During this time the editor render loop is paused. The same thing happens in one of our tests: a lot of resources are created and released but almost no rendering occurs. This is guesswork, though, and might be completely coincidental.


    What I tried:
    I have been trying to deal with this for months now. Even if our app does something wrong, this is definitely a driver issue, isn't it? Even if we leak some resources, the driver should clean up when the application exits, right? I checked for leaking OpenGL resources in our app using gDEBugger but it didn't find anything. The memory usage of the app itself is not increasing much either.


    Do you have any suggestions for what I can do to find the source of the problem? I guess our application must be doing something unusual, otherwise a lot more people would have complained about this. Is there any way to get more information from the driver to find out what is actually leaking? Even if I were able to make a repro case, it seems it is no longer possible to submit a bug to NVIDIA. Some years ago I was a registered developer, but now my account seems to have been deleted and all my attempts to register again seem to be ignored.

    I hope this is not the wrong forum, but I don't know where else to post. This is really becoming a big problem for us and I'm getting desperate. Thanks for your time.

  2. #2
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,198

    Re: Resource leak in 2xx nvidia drivers

    Are you calling any glGen* functions per-frame? Yes, the driver will clean up "leaked" resources on shutdown, but if you are allocating new resources each frame you may run out of memory well before the driver gets the chance to do this.

  3. #3
    Intern Newbie
    Join Date
    Apr 2002
    Location
    Berlin
    Posts
    49

    Re: Resource leak in 2xx nvidia drivers

    Thank you for your reply. No, I don't usually use glGen* calls per frame. In some places textures are generated on the fly when they are first accessed, but this does not happen very often, and the error also occurs when this lazy texture generation is disabled and everything is precached. I also do not use very many FBOs. Actually there is only one FBO for rendering to textures. It is reused all the time because we have very many textures we render to (hundreds).

    Maybe I didn't explain it clearly. The driver leaks memory across several starts of the application. At some point, when I restart our app, I cannot even create a single render target texture without getting an FBO error caused by an out-of-memory condition in the driver, and only a reboot fixes this. One of the very first things that happens in our renderer is the creation of a 32x32 RGBA dummy texture. We render a checkerboard pattern to it and use it as a dummy texture whenever loading a texture fails. Even this fails from time to time.

  4. #4
    Intern Newbie
    Join Date
    Apr 2002
    Location
    Berlin
    Posts
    49

    Re: Resource leak in 2xx nvidia drivers

    I tried to use this extension:
    http://developer.download.nvidia.com...emory_info.txt

    When the error occurs none of the numbers indicate any memory shortage.

    GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX=786432
    GL_GPU_MEMORY_INFO_TOTAL_AVAILABLE_MEMORY_NVX=786432
    GL_GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX=634308
    GL_GPU_MEMORY_INFO_EVICTION_COUNT_NVX=26
    GL_GPU_MEMORY_INFO_EVICTED_MEMORY_NVX=38060
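
    Reading the two vidmem values as 786432 KB dedicated and 634308 KB currently available (the extension reports kilobytes), the arithmetic behind "no memory shortage" can be sketched as below. The struct and function names are made up for illustration and are not part of any API.

```cpp
// Hypothetical summary of two NVX_gpu_memory_info readings (both in KB).
struct VidmemSummary {
    long   usedKb;       // dedicated minus currently available
    double usedPercent;  // share of dedicated vidmem in use
    double freeMb;       // currently available vidmem in MB
};

VidmemSummary summarizeVidmem(long dedicatedKb, long availableKb)
{
    VidmemSummary s;
    s.usedKb = dedicatedKb - availableKb;
    // Guard against a zero reading so the percentage stays well defined.
    s.usedPercent = dedicatedKb > 0 ? 100.0 * s.usedKb / dedicatedKb : 0.0;
    s.freeMb = availableKb / 1024.0;
    return s;
}
```

    With the numbers above, summarizeVidmem(786432, 634308) gives 152124 KB in use, which is a bit under a fifth of the 768 MB card, with roughly 619 MB still reported as available.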

    This is what GLExpert prints when the error occurs:
    "OGLE: Category: 0x00004000, MessageID: 0x008E0000
    Basic framebuffer object information: The COLOR_ATTACHMENT0 attachment is unsupported, because it is not allocated (out of memory)."

    Sometimes I also get this when calling glGenFramebuffers:
    OGLE: Category: 0x00000002, MessageID: 0x00810008
    Software rendering has been enabled because the current framebuffer related state is not supported with the current hardware configuration: The framebuffer is not a hardware accelerated resource.


  5. #5
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,209

    Re: Resource leak in 2xx nvidia drivers

    Quote Originally Posted by muhkuh
    I tried to use this extension:
    http://developer.download.nvidia.com...emory_info.txt

    When the error occurs none of the numbers indicate any memory shortage.

    GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX=786432
    GL_GPU_MEMORY_INFO_TOTAL_AVAILABLE_MEMORY_NVX=786432
    GL_GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX=634308
    GL_GPU_MEMORY_INFO_EVICTION_COUNT_NVX=26
    GL_GPU_MEMORY_INFO_EVICTED_MEMORY_NVX=38060

    While I'm no NVIDIA driver guru, I use NVX_gpu_memory_info, and in my experience the EVICT numbers are always 0 unless you've banged your head against the limits of GPU memory at some point. I'd restart your system, verify the EVICT numbers are 0, start your app, and watch the numbers.
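
    One way to "watch the numbers" is to keep the previous reading and report only the differences. This is a rough sketch under the assumption that the five values have already been fetched with glGetIntegerv; the struct and function names are hypothetical.

```cpp
#include <sstream>
#include <string>

// One reading of the five NVX_gpu_memory_info values (sizes in KB).
struct GpuMemSnapshot {
    int dedicatedKb;
    int totalAvailableKb;
    int currentAvailableKb;
    int evictionCount;
    int evictedKb;
};

// Returns an empty string when the eviction counters did not move,
// otherwise a one-line description of what changed between two readings.
std::string describeEvictionChange(const GpuMemSnapshot& before,
                                   const GpuMemSnapshot& after)
{
    const int dCount = after.evictionCount - before.evictionCount;
    const int dKb    = after.evictedKb - before.evictedKb;
    if (dCount == 0 && dKb == 0)
        return std::string();

    const int dAvail = after.currentAvailableKb - before.currentAvailableKb;
    std::ostringstream out;
    out << std::showpos              // print an explicit sign on each delta
        << "evictions " << dCount
        << ", evicted " << dKb << " KB"
        << ", available vidmem " << dAvail << " KB";
    return out.str();
}
```

    Logging the returned line next to whatever the app did last (window creation, resize, resource load) makes it easier to see which step moves the counters.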

  6. #6
    Intern Newbie
    Join Date
    Apr 2002
    Location
    Berlin
    Posts
    49

    Re: Resource leak in 2xx nvidia drivers

    Interesting. Even after rebooting, these numbers are never zero for me. After testing a lot yesterday, it seems that these numbers grow after every start of our editor. I don't even have to load any resources; it is enough to start the editor and close it. It also happens with one of our test executables, which in one test creates a second window with a second render context (the editor does not do this, though). In contrast, I can run our player app all day long without the eviction count changing at all.

    I'm now running a second little GLUT app that does nothing more than continuously print these numbers, to see exactly when the counts increase. In the editor, a window handle from .NET WinForms is passed to native code, a context is created, and then the counts suddenly increase while something is going on inside .NET. But I guess it's not easy to debug it this way, because the driver probably does things asynchronously in another thread, so the moment I observe the change might be after the command that is responsible for it. Maybe it's easier to track this down in the other app, where everything is more or less under my control.

    Edit:
    Are you running Vista/Win7 or WinXP? I read that these numbers are global for all applications on Vista/Win7 and local to the application on WinXP. I'm running Vista64, so maybe it is normal that something is always evicted in the driver globally. If you are running WinXP, this might be the reason the eviction counter is always zero for you, as it is local to the GL context.

  7. #7
    Intern Newbie
    Join Date
    Apr 2002
    Location
    Berlin
    Posts
    49

    Re: Resource leak in 2xx nvidia drivers

    I've discovered that resizing a hidden GL window causes GL_GPU_MEMORY_INFO_EVICTION_COUNT_NVX and GL_GPU_MEMORY_INFO_EVICTED_MEMORY_NVX to increase. This happens when starting the editor because it creates the render window as a child of the main window while the main window is still hidden. I think we can work around the issue. I wrote a simple test case using GLUT that shows it. I'm still checking whether this is really the problem, though.

    Here is the test case source in case someone is interested. It continuously resizes a hidden window and prints the eviction count, the evicted memory and the currently available video memory whenever the eviction count changes.


    My spec:
    Operating System: Windows Vista (TM) Ultimate, 64-bit (Service Pack 2)
    GPU processor: GeForce 8800 GTX
    Driver version: 260.99

    Code :
    // Repro: keep resizing a hidden GLUT window and print the
    // NVX_gpu_memory_info counters whenever the eviction count changes.
    #pragma comment(lib, "glut32.lib")
    #include "glut.h"
    #include <stdio.h>

    // Tokens from the NVX_gpu_memory_info extension (sizes are in KB).
    #define GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX 0x9047
    #define GL_GPU_MEMORY_INFO_TOTAL_AVAILABLE_MEMORY_NVX 0x9048
    #define GL_GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX 0x9049
    #define GL_GPU_MEMORY_INFO_EVICTION_COUNT_NVX 0x904A
    #define GL_GPU_MEMORY_INFO_EVICTED_MEMORY_NVX 0x904B

    void PrintVideoMemory()
    {
        static GLint evicted = 0;  // eviction count at the time of the last print
        GLint vidmem = 0, mem_available = 0, vidmem_available = 0, evicted_count = 0, evicted_size = 0;
        glGetIntegerv(GL_GPU_MEMORY_INFO_DEDICATED_VIDMEM_NVX, &vidmem);
        glGetIntegerv(GL_GPU_MEMORY_INFO_TOTAL_AVAILABLE_MEMORY_NVX, &mem_available);
        glGetIntegerv(GL_GPU_MEMORY_INFO_CURRENT_AVAILABLE_VIDMEM_NVX, &vidmem_available);
        glGetIntegerv(GL_GPU_MEMORY_INFO_EVICTION_COUNT_NVX, &evicted_count);
        glGetIntegerv(GL_GPU_MEMORY_INFO_EVICTED_MEMORY_NVX, &evicted_size);

        // Only print when the eviction count has changed since the last print.
        if (evicted != evicted_count)
        {
            printf("evicted_count=%d, evicted_size=%dkb, vidmem_available=%dkb\n",
                evicted_count, evicted_size, vidmem_available);

            evicted = evicted_count;
        }
    }

    void renderScene(void)
    {
        static bool flip = true;
        flip = !flip;

        // Draw a single triangle so that some rendering happens each frame.
        glClear(GL_COLOR_BUFFER_BIT);
        glBegin(GL_TRIANGLES);
        glVertex3f(-0.5f, -0.5f, 0.0f);
        glVertex3f(0.5f, 0.0f, 0.0f);
        glVertex3f(0.0f, 0.5f, 0.0f);
        glEnd();
        glutSwapBuffers();
        PrintVideoMemory();

        // Alternate between two window sizes on every call.
        if (flip)
            glutReshapeWindow(300, 200);
        else
            glutReshapeWindow(300, 300);
    }

    int main(int argc, char* argv[])
    {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);
        glutInitWindowPosition(100, 100);
        glutInitWindowSize(100, 100);

        glutCreateWindow("resource leak provoker");
        glutHideWindow();	// comment this out to make the leak disappear

        // Use the same function for display and idle so it runs continuously.
        glutDisplayFunc(&renderScene);
        glutIdleFunc(&renderScene);
        glutMainLoop();

        return 0;
    }
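
    The post above does not say how the workaround would look. One possible shape, given purely as an untested assumption, is to remember resize requests while the window is hidden and apply only the latest one once it becomes visible. The class name DeferredResizer and its callback are hypothetical and not part of GLUT or of the editor.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper: swallows resize requests while the window is hidden
// and replays only the most recent one when the window is shown.
class DeferredResizer {
public:
    explicit DeferredResizer(std::function<void(int, int)> applyResize)
        : apply_(std::move(applyResize)) {}

    // Call instead of resizing the GL window directly.
    void requestResize(int width, int height)
    {
        if (visible_) {
            apply_(width, height);
        } else {
            pendingWidth_ = width;    // keep only the latest request
            pendingHeight_ = height;
            hasPending_ = true;
        }
    }

    // Call from the show/hide notification of the window.
    void setVisible(bool visible)
    {
        visible_ = visible;
        if (visible_ && hasPending_) {
            hasPending_ = false;
            apply_(pendingWidth_, pendingHeight_);
        }
    }

private:
    std::function<void(int, int)> apply_;
    bool visible_ = false;
    bool hasPending_ = false;
    int pendingWidth_ = 0;
    int pendingHeight_ = 0;
};
```

    In the GLUT test case the callback could simply forward to glutReshapeWindow. Whether this actually keeps the eviction counters from growing would still have to be verified on the affected driver.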

  8. #8
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,198

    Re: Resource leak in 2xx nvidia drivers

    Hmmm - maybe check that your program actually is exiting properly and fully? Check in Task Manager that there are no instances of it still running, check your code to ensure that your context is being destroyed, and so on.

  9. #9
    Senior Member OpenGL Guru Dark Photon
    Join Date
    Oct 2004
    Location
    Druidia
    Posts
    3,209

    Re: Resource leak in 2xx nvidia drivers

    Quote Originally Posted by muhkuh
    Edit:
    Are you running Vista/Win7 or WinXP? I read that these numbers are global for all applications on Vista/Win7 and local to the application on WinXP. I'm running Vista64, so maybe it is normal that something is always evicted in the driver globally. If you are running WinXP, this might be the reason the eviction counter is always zero for you, as it is local to the GL context.

    Ah! That's probably it. Vista/Win7 has that perf-wasting Aero compositor, which uses the GPU for rendering and compositing the desktop, whereas XP doesn't. It could very well be that the OS is overflowing your GPU memory at some point with normal desktop rendering ops, giving you an eviction count > 0.

    I'm running Linux with the perf-eating GPU compositor turned off (so this should be more like the XP case). My app is the only GPU-intensive thing running, and my eviction counts are 0 unless I get close to the limit of GPU memory.

    You could try after disabling Aero and rebooting to see if your results differ:

    http://www.howtogeek.com/howto/windo...windows-vista/

  10. #10
    Intern Newbie
    Join Date
    Apr 2002
    Location
    Berlin
    Posts
    49

    Re: Resource leak in 2xx nvidia drivers

    Disabling Aero seems to help. The test case no longer causes the eviction count to increase. But I still have to check whether this also solves our original problem of various OpenGL calls failing with out-of-memory errors.

    By the way, this is what the extension spec says about the evicted count in Vista, WinXP and Linux:
    "Implementing the eviction information is OS dependent.
    For XP and Linux the eviction information is specific to the current process/state since eviction is determined in the individual client. For Vista it is system wide since eviction is determined by the OS."
